Search Results: "js"

28 January 2024

Niels Thykier: Annotating the Debian packaging directory

In my previous blog post, Providing online reference documentation for debputy, I made a point about how debhelper documentation was suboptimal on account of being static rather than online. The thing is that debhelper is not alone in this problem space, even if it is a major contributor to the number of packaging files you have to know about. If we look at the "competition" here, such as Fedora and Arch Linux, they tend to have only one packaging file. While most Debian people will tell you a long list of cons about having one packaging file (such as Fedora's spec file being 3+ domain-specific languages "mashed" into one file), one major advantage is that there is only "the one packaging file". You only need to remember where to find the documentation for one file, which is great when you are running on wetware with limited storage capacity. This means that as a newbie, you can dedicate fewer mental resources to tracking multiple files and how they interact, and more effort to understanding the "one file" at hand.
I started by asking myself how we in Debian can make the packaging stack more accessible to newcomers. Spoiler alert: I dug myself into a rabbit hole and ended up somewhere other than where I thought I was going. I started by wanting to scan the debian directory and annotate all files that I could with documentation links. The logic was that if debputy could do that for you, then you could spend more mental effort elsewhere. So I combined debputy's packager provided files detection with a static list of files and I quickly had a good starting point for debputy-based packages.
Adding (non-static) dpkg and debhelper files to the mix
Now, I could have closed the topic here and said "Look, I did debputy files plus a couple of super common files". But I decided to take it a bit further. I added support for handling some dpkg files like packager provided files (such as debian/substvars and debian/symbols). But even then, we all know that debhelper is the big hurdle and a major part of the omission...
In another previous blog post (A new Debian package helper: debputy), I made a point about how debputy could list all auxiliary files while debhelper could not. This was exactly the kind of capability I would need if this feature was to cover debhelper. Now, I also remarked in that blog post that I was not willing to maintain such a list. Also, I may have ranted about static documentation being unhelpful for debhelper as it excludes third-party provided tooling.
Fortunately, a recent update to dh_assistant had provided some basic plumbing for loading dh sequences. This meant that getting a list of all relevant commands for a source package was a lot easier than it used to be. Once you have a list of commands, it is possible to check all of them for dh's NOOP PROMISE hints. In these hints, a command can assert that it does nothing if a given pkgfile is not present. This led to the new dh_assistant list-guessed-dh-config-files command, which lists all declared pkgfiles and which helpers listed them.
With this combined feature set in place, debputy could call dh_assistant to get a list of pkgfiles, pretend they were packager provided files and annotate those along with the manpage for the relevant debhelper command. The exciting thing about letting debputy resolve the pkgfiles is that debputy will resolve "named" files automatically (debhelper tools will only do so when --name is passed), so it is much more likely to detect named pkgfiles correctly too.
Side note: I am going to ignore the elephant in the room for now, which is dh_installsystemd and its package@.service files and the wide-spread use of debian/foo.service where there is no package called foo. For the latter case, the "proper" name would be debian/pkg.foo.service. With the new dh_assistant feature done and added to debputy, debputy could now detect the ubiquitous debian/install file. Excellent. But less great was that the very common debian/docs file was not. Turns out that dh_installdocs cannot be skipped by dh, so it cannot have NOOP PROMISE hints. Meh... Well, dh_assistant could learn about a new INTROSPECTABLE marker in addition to the NOOP PROMISE and then I could sprinkle that into a few commands. Indeed that worked and meant that debian/postinst (etc.) are now also detectable. At this point, debputy would be able to identify a wide range of debhelper related configuration files in debian/ and at least associate each of them with one or more commands. Nice, surely, this would be a good place to stop, right...?
Adding more metadata to the files
The debhelper detected files only had a command name and a manpage URI for that command. It would be nice if we could contextualize this a bit more. For instance, is this file installed into the package as is, like debian/pam, or is it a file list to be processed, like debian/install? To make this distinction, I could add the most common debhelper file types to my static list and then merge the results together. Except, I do not want to maintain a full list in debputy. Fortunately, debputy has a quite extensible plugin infrastructure, so I added a new plugin feature to provide this kind of detail and now I can outsource the problem! I split my definitions into two and placed the generic ones in the debputy-documentation plugin and moved the debhelper related ones to debhelper-documentation. Additionally, third-party dh addons could provide their own debputy plugin to add context to their configuration files.
So, this gave birth to file categories and configuration features, which describe each file on different fronts. As an example, debian/gbp.conf could be tagged as a maint-config to signal that it is not directly related to the package build but is more of a tool or style preference file. On the other hand, debian/install and debian/debputy.manifest would both be tagged as a pkg-helper-config. Files like debian/pam were tagged as ppf-file for packager provided file, and so on.
I mentioned configuration features above and those were added because I have had a beef with debhelper's "standard" configuration file format as read by filearray and filedoublearray. These files are often considered simple to understand, but it is hard to know how a tool will actually read the file. As an example, consider the following:
  • Will the debhelper tool use filearray, filedoublearray or neither of them to read the file? This topic has about 2 bits of entropy.
  • Will the config file be executed if it is marked executable, assuming you are using the right compat level? If it is executable, does dh-exec allow renaming for this file? This topic adds 1 or 2 bits of entropy depending on the context.
  • Will the config file be subject to glob expansions? This topic sounds like a boolean but is a complicated mess. The globs can be handled by debhelper as it parses the file for you. In this case, the globs are applied to every token. However, this is not what dh_install does. Here the last token on each line is supposed to be a directory and therefore not subject to globs. Therefore, dh_install does the globbing itself afterwards, but only on part of the tokens. So that is about 2 bits of entropy more. Actually, it gets worse...
    • If the file is executed, debhelper will refuse to expand globs in the output of the command, which was a deliberate design choice that the original debhelper maintainer made when he introduced the feature in debhelper/8.9.12. Except that dh_install interacts with this design choice and does enable glob expansion in the tool output, because it does the expansion manually after its filedoublearray call.
So these "simple" files have way too many combinations of how they can be interpreted. I figured it would be helpful if debputy could highlight these differences, so I added support for those as well. Accordingly, debian/install is tagged with multiple tags including dh-executable-config and dh-glob-after-execute. Then, I added a datatable of these tags, so it would be easy for people to look up what they meant. Ok, this seems like a closed deal, right...?
Context, context, context
However, the dh-executable-config tag, among others, is only applicable in compat 9 or later. It does not seem newbie friendly if you are told that this feature exists, but then have to read in the extended description that it actually does not apply to your package. This problem seems fixable. Thanks to dh_assistant, it is easy to figure out which compat level the package is using. Then tweak some metadata to enable per compat level rules. With that, tags like dh-executable-config only appear for packages using compat 9 or later.
Also, debputy should be able to tell you where packager provided files like debian/pam are installed. We already have the logic for packager provided files that debputy supports and I am already using the debputy engine for detecting the files. If only the plugin provided metadata gave me the install pattern, debputy would be able to tell you where this file goes in the package. Indeed, a bit of tweaking later and setting install-pattern to usr/lib/pam.d/{name}, debputy presented me with the correct install-path, with the package name replacing the {name} placeholder.
Now, I have been using debian/pam as an example, because debian/pam is installed into usr/lib/pam.d in compat 14. But in earlier compat levels, it was installed into etc/pam.d. Well, I already had an infrastructure for doing compat file tags. Off we go to add install-pattern to the compat level infrastructure and now changing the compat level would change the path. Great. (Bug warning: The value is off-by-one in the current version of debhelper. This is fixed in git)
Also, while we are in this install-pattern business, a number of debhelper config files cause files to be installed into a fixed directory. Like debian/docs, which causes files to be installed into /usr/share/doc/{package}. Surely, we can expand that as well and provide that bit of context too... and done.
(Bug warning: The code currently does not account for the main documentation package context)
It is a rather common pattern for people to use debian/foo.in files, because they want custom generation of debian/foo. Which means if you have debian/foo you get "Oh, let me tell you about debian/foo". Then you rename it to debian/foo.in and the result is "debian/foo.in is a total mystery to me!". That is suboptimal, so let's detect those as well, as if they were the original file, but add a tag saying that they are a template for a generated file and which file we suspect it generates.
Finally, if you use debputy, almost all of the standard debhelper commands are removed from the sequence, since debputy replaces them. It would be weird if these commands still contributed configuration files when they are not actually going to be invoked. This mostly happened naturally due to the way the underlying dh_assistant command works. However, any file mentioned by the debhelper-documentation plugin would unfortunately still appear. So off I went to filter the list of known configuration files against the dh_ commands that dh_assistant thought would be used for this package.
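To make the per-compat install-pattern and the debian/foo.in handling described above a bit more concrete, here is a minimal Python sketch. It is not debputy's actual plugin schema or code: the PpfRule class, its fields and the split_template helper are hypothetical names made up for illustration. The only details taken from the post are the {name} placeholder and the debian/pam paths (etc/pam.d before compat 14, usr/lib/pam.d from compat 14, subject to the off-by-one warning above).

```python
from dataclasses import dataclass, field


@dataclass
class PpfRule:
    """Hypothetical metadata for one packager provided file (not debputy's real schema)."""

    stem: str
    default_pattern: str
    # Maps "first compat level where this pattern applies" to the pattern.
    compat_patterns: dict[int, str] = field(default_factory=dict)

    def install_path(self, package: str, compat: int) -> str:
        """Expand the install pattern that is in effect for the given compat level."""
        pattern = self.default_pattern
        # Walk the thresholds in ascending order so the highest matching one wins.
        for level in sorted(self.compat_patterns):
            if compat >= level:
                pattern = self.compat_patterns[level]
        return pattern.format(name=package)


def split_template(filename: str) -> tuple[str, bool]:
    """Return (underlying file name, is_template) for names such as 'install.in'."""
    suffix = ".in"
    if filename.endswith(suffix) and len(filename) > len(suffix):
        return filename[: -len(suffix)], True
    return filename, False


# Illustrative rule for debian/pam, using the paths mentioned in the post.
PAM = PpfRule("pam", "etc/pam.d/{name}", {14: "usr/lib/pam.d/{name}"})
```

With a rule like this, changing the compat level passed to install_path changes the reported path, and split_template lets a tool describe debian/foo.in as a template for debian/foo rather than as an unknown file.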
Wrapping it up
I was several layers into this and had to dig myself out. I have ended up with a lot of data and metadata. But it was quite difficult for me to arrange the output in a user friendly manner. However, all this data did seem like it would be useful to any tool that wants to understand more about the package. So to get out of the rabbit hole, I have for now wrapped all of this into JSON and now we have a debputy tool-support annotate-debian-directory command that might be useful for other tools. To try it out, you can try the following demo:
Another day, I will figure out how to structure this output so it is useful for non-machine consumers. Suggestions are welcome. :)
Limitations of the approach
As a closing remark, I should probably remind people that this feature relies heavily on declarative features. These include:
  • When determining which commands are relevant, using Build-Depends: dh-sequence-foo is much more reliable than configuring it via the Turing complete configuration we call debian/rules.
  • When debhelper commands use NOOP promise hints, dh_assistant can "see" the config files listed in those hints, meaning the file will at least be detected. For the new introspectable hint and the debputy plugin, it is probably better to wait until the dust settles a bit before adding any of those.
You can help yourself and others to better results by using the declarative way rather than using debian/rules, which is the bane of all introspection!

26 January 2024

Bastian Venthur: Investigating popularity of Python build backends over time

Inspired by a Mastodon post by Françoise Conil, who investigated the current popularity of build backends used in pyproject.toml files, I wanted to investigate how the popularity of build backends used in pyproject.toml files evolved over the years since the introduction of PEP-0517 in 2015.
Getting the data
Tom Forbes provides a huge dataset that contains information about every file within every release uploaded to PyPI. To get the current dataset, we can use:
curl -L --remote-name-all $(curl -L "https://github.com/pypi-data/data/raw/main/links/dataset.txt")
This will download approximately 30GB of parquet files, providing detailed information about each file included in a PyPI upload, including:
  1. project name, version and release date
  2. file path, size and line count
  3. hash of the file
The dataset does not contain the actual files themselves though, more on that in a moment.
Querying the dataset using duckdb
We can now use duckdb to query the parquet files directly. Let's look into the schema first:
describe select * from '*.parquet';
 
column_name       column_type   null
varchar           varchar       varchar
----------------  ------------  -------
project_name      VARCHAR       YES
project_version   VARCHAR       YES
project_release   VARCHAR       YES
uploaded_on       TIMESTAMP     YES
path              VARCHAR       YES
archive_path      VARCHAR       YES
size              UBIGINT       YES
hash              BLOB          YES
skip_reason       VARCHAR       YES
lines             UBIGINT       YES
repository        UINTEGER      YES
----------------  ------------  -------
11 rows                         6 columns
 
From all files mentioned in the dataset, we only care about pyproject.toml files that are in the project's root directory. Since we'll still have to download the actual files, we need to get the path and the repository to construct the corresponding URL to the mirror that contains all files in a bunch of huge git repositories. Some files are not available on the mirrors; to skip these, we only take files where the skip_reason is empty. We also care about the timestamp of the upload (uploaded_on) and the hash to avoid processing identical files twice:
select
    path,
    hash,
    uploaded_on,
    repository
from '*.parquet'
where
    skip_reason == '' and
    lower(string_split(path, '/')[-1]) == 'pyproject.toml' and
    len(string_split(path, '/')) == 5
order by uploaded_on desc
This query runs for a few minutes on my laptop and returns ~1.2M rows.
Getting the actual files
Using the repository and path, we can now construct a URL from which we can fetch the actual file for further processing:
url = f"https://raw.githubusercontent.com/pypi-data/pypi-mirror-{repository}/code/{path}"
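As a rough illustration of how the query's path filter and the URL template above fit together, here is a small Python sketch. The two helper names are made up for this example, and the sample path in the usage note is hypothetical; only the five-component rule, the case-insensitive file name check and the URL layout come from the text.

```python
def is_root_pyproject(path: str) -> bool:
    """Mirror the SQL filter: five '/'-separated components, the last being pyproject.toml."""
    parts = path.split("/")
    return len(parts) == 5 and parts[-1].lower() == "pyproject.toml"


def mirror_url(repository: int, path: str) -> str:
    """Build the raw.githubusercontent.com URL for a file in the PyPI code mirror."""
    return (
        "https://raw.githubusercontent.com/pypi-data/"
        f"pypi-mirror-{repository}/code/{path}"
    )
```

For a made-up row with repository 42 and path packages/example/example-1.0.tar.gz/example-1.0/pyproject.toml, is_root_pyproject accepts the path and mirror_url points at the corresponding file in the pypi-mirror-42 repository.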
We can download the individual pyproject.toml files and parse them to read the build-backend into a dictionary mapping the file hash to the build backend. Downloads on GitHub are rate-limited, so downloading 1.2M files will take a couple of days. By skipping files with a hash we've already processed, we can avoid downloading the same file more than once, cutting the required downloads by circa 50%.
Results
Assuming the data is complete and my analysis is sound, these are the findings: There is a surprising number of build backends in use, but the number of uploads per build backend decreases quickly, with a long tail of single uploads:
>>> results.backend.value_counts()
backend
setuptools        701550
poetry            380830
hatchling          56917
flit               36223
pdm                11437
maturin             9796
jupyter             1707
mesonpy              625
scikit               556
                   ...
postry                 1
tree                   1
setuptoos              1
neuron                 1
avalon                 1
maturimaturinn         1
jsonpath               1
ha                     1
pyo3                   1
Name: count, Length: 73, dtype: int64
We pick only the top 4 build backends, and group the remaining ones (including PDM and Maturin) into other so they are accounted for as well. The following plot shows the relative distribution of build backends over time. Each bin represents a time span of 28 days. I chose 28 days to reduce visual clutter. Within each bin, the height of the bars corresponds to the relative proportion of uploads during that time interval:
[Figure: Relative distribution of build backends over time]
Looking at the right side of the plot, we see the current distribution. It confirms Françoise's findings about the current popularity of build backends.
Between 2018 and 2020 the graph exhibits significant fluctuations, due to the relatively low number of uploads utilizing pyproject.toml files. During that early period, Flit started as the most popular build backend, but was eventually displaced by Setuptools and Poetry. Between 2020 and 2022, the overall usage of pyproject.toml files increased significantly. By the end of 2022, the share of Setuptools peaked at 70%. After 2020, other build backends experienced a gradual rise in popularity. Among these, Hatch emerged as a notable contender, steadily gaining traction and ultimately stabilizing at 10%.
We can also look into the absolute distribution of build backends over time:
[Figure: Absolute distribution of build backends over time]
The plot shows that Setuptools has the strongest growth trajectory, surpassing all other build backends. Poetry and Hatch are growing at a comparable rate, but since Hatch started roughly 4 years after Poetry, it's lagging behind in popularity. Despite not being among the most widely used backends anymore, Flit maintains a steady and consistent growth pattern, indicating its enduring relevance in the Python packaging landscape.
The script for downloading and analyzing the data can be found in my GitHub repository.
It contains the results of the duckdb query (so you don't have to download the full dataset) and the pickled dictionary, mapping the file hashes to the build backends, saving you days of downloading and analyzing the pyproject.toml files yourself.
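The 28-day binning and the grouping of everything outside the top four backends into "other", as described for the relative-distribution plot, could be computed along these lines. This is a standard-library sketch under stated assumptions, not the code from the repository: the function name, the input shape (pairs of upload timestamp and backend name) and the choice of a fixed start date are all illustrative.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# The four most frequent backends in the listing; everything else becomes "other".
TOP_BACKENDS = {"setuptools", "poetry", "hatchling", "flit"}
BIN_WIDTH = timedelta(days=28)


def relative_shares(uploads, start):
    """Return {bin index: {backend label: share of uploads in that bin}}."""
    counts = defaultdict(Counter)
    for uploaded_on, backend in uploads:
        # Integer division of timedeltas yields the 28-day bin number.
        index = (uploaded_on - start) // BIN_WIDTH
        label = backend if backend in TOP_BACKENDS else "other"
        counts[index][label] += 1
    shares = {}
    for index, counter in counts.items():
        total = sum(counter.values())
        shares[index] = {label: n / total for label, n in counter.items()}
    return shares
```

Because each bin is normalized by its own total, bins with very few uploads swing strongly, which is consistent with the fluctuations the post notes for the early years.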

25 January 2024

Joachim Breitner: GHC Steering Committee Retrospective

After seven years of service as member and secretary on the GHC Steering Committee, I have resigned from that role. So this is a good time to look back and retrace the formation of the GHC proposal process and committee. In my memory, I helped define and shape the proposal process, optimizing it for effectiveness and throughput, but memory can be misleading, and judging from the paper trail in my email archives, this was indeed mostly Ben Gamari's and Richard Eisenberg's achievement: Already in the summer of 2016, Ben Gamari set up the ghc-proposals GitHub repository with a sketch of a process and sent out a call for nominations on the GHC users' mailing list, which I replied to. The Simons picked the first set of members, and in the fall of 2016 we discussed the committee's by-laws and procedures. As so often, Richard was an influential shaping force here.

Three ingredients
For example, it was he who suggested that for each proposal we have one committee member be the "Shepherd", overseeing the discussion. I believe this was one ingredient for the process's effectiveness: There is always one person in charge, and thus we avoid the delays incurred when any one of a non-singleton set of volunteers has to do the next step (and everyone hopes someone else does it). The next ingredient was that we do not usually require a vote among all members (again, not easy with volunteers with limited bandwidth and occasional phases of absence). Instead, the shepherd makes a recommendation (accept/reject), and if the other committee members do not complain, this silence is taken as consent, and we come to a decision. It seems this idea can also be traced back to Richard, who suggested that "once a decision is requested, the shepherd [generates] consensus. If consensus is elusive, then we vote." At the end of the year we agreed on and wrote down these rules, created the mailing list for our internal, but publicly archived, committee discussions, and began accepting proposals, starting with Adam Gundry's OverloadedRecordFields. At that point, there was no secretary role yet, so how did I become one? It seems that in February 2017 I started to clean up and refine the process documentation, fixing bugs in the process (like requiring authors to set GitHub labels when they don't even have permissions to do that). This in particular meant that someone from the committee had to manually handle submissions and so on, and by the aforementioned principle that at every step there ought to be exactly one person in charge, the role of a secretary followed naturally. In the email in which I described that role I wrote:
Simon already shoved me towards picking up the secretary hat, to reduce load on Ben.
So when I merged the updated process documentation, I already listed myself as "secretary". It wasn't just Simon's shoving that put me into the role, though. I dug out my original self-nomination email to Ben, and among other things I wrote:
I also hope that there is going to be clear responsibilities and a clear workflow among the committee. E.g. someone (possibly rotating), maybe called the secretary, who is in charge of having an initial look at proposals and then assigning it to a member who shepherds the proposal.
So it is hardly a surprise that I became secretary, when it was dear to my heart to have a smooth continuous process here. I am rather content with the result: These three ingredients (single secretary, per-proposal shepherds, silence-is-consent) helped the committee to be effective throughout its existence, even as every once in a while individual members dropped out.

Ulterior motivation
I must admit, however, there was an ulterior motivation behind me grabbing the secretary role: Yes, I did want the committee to succeed, and I did want authors to receive timely, good and decisive feedback on their proposals, but I did not really want to have to do that part. I am, in fact, a lousy proposal reviewer. I am too generous when reading proposals, and more likely to mentally fill gaps in a specification than to spot them. I am always optimistically assuming that the authors surely know what they are doing, rather than critically assessing the impact, the implementation cost and the interaction with other language features. And, maybe more importantly: why should I know which changes are good and which are not so good in the long run? Clearly, the authors cared enough about a proposal to put it forward, so there is some need, and I do believe that Haskell should stay an evolving and innovating language, but how does this help me decide about this or that particular feature? I even, during the formation of the committee, explicitly asked that we write down some guidance on "Vision and Guideline"; do we want to foster change or innovation, or be selective gatekeepers? Should we accept features that are proven to be useful, or should we accept features so that they can prove to be useful? This discussion, however, did not lead to a concrete result, and the assessment of proposals relied on the sum of each member's personal preference, expertise and gut feeling. I am not saying that this was a mistake: It is hard to come up with a general guideline here, and even harder to find one that does justice to each individual proposal. So the secret motivation for me to grab the secretary post was that I could contribute without having to judge proposals. Being secretary allowed me to assign most proposals to others to shepherd, and only once in a while did I take care of a proposal myself, when it seemed to be very straightforward. Sneaky, ain't it?

7 Years later
For years to come I happily played secretary: When an author finished their proposal and public discussion ebbed down they would ping me on GitHub, I would pick a suitable shepherd among the committee and ask them to judge the proposal. Eventually, the committee would come to a conclusion, usually by implicit consent, sometimes by voting, and I'd merge the pull request and update the metadata thereon. Every few months I'd summarize the current state of affairs to the committee (what happened since the last update, which proposals are currently on our plate), and once per year I gathered the data for Simon Peyton Jones' annual GHC Status Report. Sometimes some members needed a nudge or two to act. Some would eventually step down, and I'd send around a call for nominations and, when the nominations came in, distribute them off-list among the committee and tally the votes. Initially, that was exciting. For a long while it was a pleasant and rewarding routine. Eventually, it became a mere chore. I noticed that I didn't quite care so much anymore about some of the discussion, and there was a decent amount of navel-gazing, meta-discussions and some wrangling about claims of authority that was probably useful and necessary, but wasn't particularly fun. I also began to notice weaknesses in the processes that I helped shape: We could really use some more automation for showing proposal statuses, notifying people when they have to act, and nudging them when they don't. The whole silence-is-assent approach is good for throughput, but not necessarily great for quality, and maybe the committee members need to be pushed more firmly to engage with each proposal. Like GHC itself, the committee processes deserve continuous refinement and refactoring, and since I could not muster the motivation to change my now well-trod secretarial ways, it was time for me to step down. Luckily, Adam Gundry volunteered to take over, and that makes me feel much less bad for quitting.
Thanks for that! And although I am for my day job now enjoying a language that has many of the things out of the box that for Haskell are still only language extensions or even just future proposals (dependent types, BlockArguments, do notation with (← foo) expressions and Unicode), I'm still around, hosting the Haskell Interlude Podcast, writing on this blog and hanging out at ZuriHac etc.

22 January 2024

Dirk Eddelbuettel: RProtoBuf 0.4.22 on CRAN: Updated Windows Support!

A new maintenance release 0.4.22 of RProtoBuf arrived on CRAN earlier today. RProtoBuf provides R with bindings for the Google Protocol Buffers ("ProtoBuf") data encoding and serialization library used and released by Google, and deployed very widely in numerous projects as a language and operating-system agnostic protocol. This release matches the recent 0.4.21 release which enabled use of the package with newer ProtoBuf releases. Tomas has been updating the Windows / rtools side of things, and supplied us with a simple PR that will enable building with those updated versions once finalised. The following section from the NEWS.Rd file has full details.

Changes in RProtoBuf version 0.4.22 (2022-12-13)
  • Apply patch by Tomas Kalibera to support updated rtools to build with newer ProtoBuf releases on Windows

Thanks to my CRANberries, there is a diff to the previous release. The RProtoBuf page has copies of the (older) package vignette, the quick overview vignette, and the pre-print of our JSS paper. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

17 January 2024

Colin Watson: Task management

Now that I'm freelancing, I need to actually track my time, which is something I've had the luxury of not having to do before. That meant something of a rethink of the way I've been keeping track of my to-do list. Up to now that was a combination of things like the bug lists for the projects I'm working on at the moment, whatever task tracking system Canonical was using at the moment (Jira when I left), and a giant flat text file in which I recorded logbook-style notes of what I'd done each day plus a few extra notes at the bottom to remind myself of particularly urgent tasks. I could have started manually adding times to each logbook entry, but ugh, let's not.
In general, I had the following goals (which were a bit reminiscent of my address book):
I didn't do an elaborate evaluation of multiple options, because I'm not trying to come up with the best possible solution for a client here. Also, there are a bazillion to-do list trackers out there and if I tried to evaluate them all I'd never do anything else. I just wanted something that works well enough for me.
Since it came up on Mastodon: a bunch of people swear by Org mode, which I know can do at least some of this sort of thing. However, I don't use Emacs and don't plan to use Emacs. nvim-orgmode does have some support for time tracking, but when I've tried vim-based versions of Org mode in the past I've found they haven't really fitted my brain very well.
Taskwarrior and Timewarrior
One of the other Freexian collaborators mentioned Taskwarrior and Timewarrior, so I had a look at those. The basic idea of Taskwarrior is that you have a task command that tracks each task as a blob of JSON and provides subcommands to let you add, modify, and remove tasks with a minimum of friction. task add adds a task, and you can add metadata like project:Personal (I always make sure every task has a project, for ease of filtering).
Just running task shows you a task list sorted by Taskwarrior's idea of urgency, with an ID for each task, and there are various other reports with different filtering and verbosity. task <id> annotate lets you attach more information to a task. task <id> done marks it as done. So far so good, so a redacted version of my to-do list looks like this:
$ task ls
ID A Project     Tags                 Description
17   Freexian                         Add Incus support to autopkgtest [2]
 7   Columbiform                      Figure out Lloyds online banking [1]
 2   Debian                           Fix troffcvt for groff 1.23.0 [1]
11   Personal                         Replace living room curtain rail
Once I got comfortable with it, this was already a big improvement. I haven't bothered to learn all the filtering gadgets yet, but it was easy enough to see that I could do something like task all project:Personal and it'd show me both pending and completed tasks in that project, and that all the data was stored in ~/.task - though I have to say that there are enough reporting bells and whistles that I haven't needed to poke around manually. In combination with the regular backups that I do anyway (you do too, right?), this gave me enough confidence to abandon my previous text-file logbook approach.
Next was time tracking. Timewarrior integrates with Taskwarrior, albeit in an only semi-packaged way, and it was easy enough to set that up. Now I can do:
$ task 25 start
Starting task 00a9516f 'Write blog post about task tracking'.
Started 1 task.
Note: '"Write blog post about task tracking"' is a new tag.
Tracking Columbiform "Write blog post about task tracking"
  Started 2024-01-10T11:28:38
  Current                  38
  Total               0:00:00
You have more urgent tasks.
Project 'Columbiform' is 25% complete (3 of 4 tasks remaining).
When I stop work on something, I do task active to find the ID, then task <id> stop. Timewarrior does the tedious stopwatch business for me, and I can manually enter times if I forget to start/stop a task. Then the really useful bit: I can do something like timew summary :month <name-of-client> and it tells me how much to bill that client for this month. Perfect.

I also started using VIT to simplify the day-to-day flow a little, which means I'm normally just using one or two keystrokes rather than typing longer commands. That isn't really necessary from my point of view, but it does save some time.

Android integration
I left Android integration for a bit later since it wasn't essential. When I got round to it, I have to say that it felt a bit clumsy, but it did eventually work. The first step was to set up a taskserver. Most of the setup procedure was OK, but I wanted to use Let's Encrypt to minimize the amount of messing around with CAs I had to do. Getting this to work involved hitting things with sticks a bit, and there's still a local CA involved for client certificates. What I ended up with was a certbot setup with the webroot authenticator and a custom deploy hook as follows (with cert_name replaced by a DNS name in my house domain):
#! /bin/sh
set -eu
cert_name=taskd.example.org
found=false
for domain in $RENEWED_DOMAINS; do
    case "$domain" in
        $cert_name)
            found=:
            ;;
    esac
done
$found || exit 0
install -m 644 "/etc/letsencrypt/live/$cert_name/fullchain.pem" \
    /var/lib/taskd/pki/fullchain.pem
install -m 640 -g Debian-taskd "/etc/letsencrypt/live/$cert_name/privkey.pem" \
    /var/lib/taskd/pki/privkey.pem
systemctl restart taskd.service
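The domain check at the top of the hook can be tried on its own before wiring it into certbot. This is only a sketch: cert_renewed is a hypothetical helper name, and RENEWED_DOMAINS is set by hand here to mimic the space-separated list that certbot exports to deploy hooks.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the name in $1 appears in the
# space-separated list given in $2.
cert_renewed() {
    wanted=$1
    for domain in $2; do
        if [ "$domain" = "$wanted" ]; then
            return 0
        fi
    done
    return 1
}

# Example values only; certbot would normally export this variable.
RENEWED_DOMAINS="www.example.org taskd.example.org"
if cert_renewed taskd.example.org "$RENEWED_DOMAINS"; then
    echo "would install the certificates and restart taskd"
else
    echo "nothing to do"
fi
```

Keeping the check in a function makes it easy to run the hook by hand with a fake RENEWED_DOMAINS value and confirm that unrelated renewals leave taskd alone.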
I could then set this in /etc/taskd/config (server.crl.pem and ca.cert.pem were generated using the documented taskserver setup procedure):
server.key=/var/lib/taskd/pki/privkey.pem
server.cert=/var/lib/taskd/pki/fullchain.pem
server.crl=/var/lib/taskd/pki/server.crl.pem
ca.cert=/var/lib/taskd/pki/ca.cert.pem
Then I could set taskd.ca on my laptop to /usr/share/ca-certificates/mozilla/ISRG_Root_X1.crt and otherwise follow the client setup instructions, run task sync init to get things started, and then task sync every so often to sync changes between my laptop and the taskserver.

I used TaskWarrior Mobile as the client. I have to say I wouldn't want to use that client as my primary task tracking interface: the setup procedure is clunky even beyond the necessity of copying a client certificate around, it expects you to give it a .taskrc rather than having a proper settings interface for that, and it only seems to let you add a task if you specify a due date for it. It also lacks Timewarrior integration, so I can only really use it when I don't care about time tracking, e.g. personal tasks. But that's really all I need, so it meets my minimum requirements.

Next?
Considering this is literally the first thing I tried, I have to say I'm pretty happy with it. There are a bunch of optional extras I haven't tried yet, but in general it kind of has the vim nature for me: if I need something it's very likely to exist or easy enough to build, but the features I don't use don't get in my way. I wouldn't recommend any of this to somebody who didn't already spend most of their time in a terminal - but I do. I'm glad people have gone to all the effort to build this so I didn't have to.

12 January 2024

Freexian Collaborators: Monthly report about Debian Long Term Support, December 2023 (by Roberto C. Sánchez)

Like each month, have a look at the work funded by Freexian's Debian LTS offering.

Debian LTS contributors
In December, 18 contributors were paid to work on Debian LTS; their reports are available:
  • Abhijith PA did 7.0h (out of 7.0h assigned and 7.0h from previous period), thus carrying over 7.0h to the next month.
  • Adrian Bunk did 16.0h (out of 26.25h assigned and 8.75h from previous period), thus carrying over 19.0h to the next month.
  • Bastien Roucariès did 16.0h (out of 16.0h assigned and 4.0h from previous period), thus carrying over 4.0h to the next month.
  • Ben Hutchings did 8.0h (out of 7.25h assigned and 16.75h from previous period), thus carrying over 16.0h to the next month.
  • Chris Lamb did 18.0h (out of 18.0h assigned).
  • Emilio Pozuelo Monfort did 8.0h (out of 26.75h assigned and 8.25h from previous period), thus carrying over 27.0h to the next month.
  • Guilhem Moulin did 25.0h (out of 18.0h assigned and 7.0h from previous period).
  • Holger Levsen did 5.5h (out of 5.5h assigned).
  • Jochen Sprickerhof did 0.0h (out of 0h assigned and 10.0h from previous period), thus carrying over 10.0h to the next month.
  • Lee Garrett did 0.0h (out of 25.75h assigned and 9.25h from previous period), thus carrying over 35.0h to the next month.
  • Markus Koschany did 35.0h (out of 35.0h assigned).
  • Roberto C. Sánchez did 9.5h (out of 5.5h assigned and 6.5h from previous period), thus carrying over 2.5h to the next month.
  • Santiago Ruano Rincón did 8.255h (out of 3.26h assigned and 12.745h from previous period), thus carrying over 7.75h to the next month.
  • Sean Whitton did 4.25h (out of 3.25h assigned and 6.75h from previous period), thus carrying over 5.75h to the next month.
  • Sylvain Beucler did 16.5h (out of 21.25h assigned and 13.75h from previous period), thus carrying over 18.5h to the next month.
  • Thorsten Alteholz did 14.0h (out of 14.0h assigned).
  • Tobias Frost did 10.25h (out of 12.0h assigned), thus carrying over 1.75h to the next month.
  • Utkarsh Gupta did 18.75h (out of 11.25h assigned and 13.5h from previous period), thus carrying over 6.0h to the next month.

Evolution of the situation
In December, we released 29 DLAs. A particularly notable update in December was prepared by LTS contributor Santiago Ruano Rincón for the openssh package. The update produced DLA-3694-1 and included a fix for the Terrapin Attack (CVE-2023-48795), which was a rather serious flaw in the SSH protocol itself. The package bluez was the subject of another notable update by LTS contributor Chris Lamb, which resulted in DLA-3689-1 to address an insecure default configuration which allowed attackers to inject keyboard commands over Bluetooth without first authenticating.

The LTS team continues its efforts to have a positive impact beyond the boundaries of LTS. Several contributors worked on packages, preparing LTS updates, but also preparing patches or full updates which were uploaded to the unstable, stable, and oldstable distributions, including: Guilhem Moulin's update of tinyxml (uploads to LTS and unstable and patches submitted to the security team for stable and oldstable); Guilhem Moulin's update of xerces-c (uploads to LTS and unstable and patches submitted to the security team for oldstable); Thorsten Alteholz's update of libde265 (uploads to LTS and stable and additional patches submitted to the maintainer for stable and oldstable); Thorsten Alteholz's update of cjson (upload to LTS and patches submitted to the maintainer for stable and oldstable); and Tobias Frost's update of opendkim (sponsor maintainer-prepared upload to LTS and additionally prepared updates for stable and oldstable).

Going beyond Debian and looking to the broader community, LTS contributor Bastien Roucariès was contacted by SUSE concerning an update he had prepared for zbar. He was able to assist by coordinating with the former organization of the original zbar author to secure for SUSE access to information concerning the exploits. This has enabled another distribution to benefit from the work done in support of LTS and from the assistance of Bastien in coordinating the access to information.

Finally, LTS contributor Santiago Ruano Rincón continued work relating to how updates for packages in statically-linked language ecosystems (e.g., Go, Rust, and others) are handled. The work is presently focused on more accurately and reliably identifying which packages are impacted in a given update scenario to enable notifications to be published so that users will be made aware of these situations as they occur. As the work continues, it will eventually result in improvements to Debian infrastructure so that the LTS team and Security team are able to manage updates of this nature in a more consistent way.

Thanks to our sponsors
Sponsors that joined recently are in bold.

10 January 2024

Dirk Eddelbuettel: Rcpp 1.0.12 on CRAN: New Maintenance / Update Release

The Rcpp Core Team is once again thrilled to announce a new release 1.0.12 of the Rcpp package. It arrived on CRAN early today, and has since been uploaded to Debian as well. Windows and macOS builds should appear at CRAN in the next few days, as will builds in different Linux distributions, and of course r2u should catch up tomorrow. The release was uploaded yesterday, and ran its reverse dependencies overnight. Rcpp always gets flagged no matter what because of the grandfathered .Call(symbol), but we had not a single change to worse among over 2700 reverse dependencies!

This release continues with the six-month January-July cycle started with release 1.0.5 in July 2020. As a reminder, we do of course make interim snapshot dev or rc releases available via the Rcpp drat repo and strongly encourage their use and testing; I run my systems with these versions, which tend to work just as well and are also fully tested against all reverse dependencies.

Rcpp has long established itself as the most popular way of enhancing R with C or C++ code. Right now, 2791 packages on CRAN depend on Rcpp for making analytical code go faster and further, along with 254 in BioConductor. On CRAN, 13.8% of all packages depend (directly) on Rcpp, and 59.9% of all compiled packages do. From the cloud mirror of CRAN (which is but a subset of all CRAN downloads), Rcpp has been downloaded 78.1 million times. The two published papers (also included in the package as preprint vignettes) have, respectively, 1766 (JSS, 2011) and 292 (TAS, 2018) citations, while the book (Springer useR!, 2013) has another 617.

This release is incremental as usual, generally preserving existing capabilities faithfully while smoothing out corners and / or extending slightly, sometimes in response to changing and tightened demands from CRAN or R standards. The full list below details all changes, their respective PRs and, if applicable, issue tickets. Big thanks from all of us to all contributors!

Changes in Rcpp release version 1.0.12 (2024-01-08)
  • Changes in Rcpp API:
    • Missing header includes as spotted by some recent tools were added in two places (Michael Chirico in #1272 closing #1271).
    • Casts to avoid integer overflow in matrix row/col selections have been added (Aaron Lun #1281).
    • Three print format corrections uncovered by R-devel were applied with thanks to Tomas Kalibera (Dirk in #1285).
    • Correct a print format correction in the RcppExports glue code (Dirk in #1288 fixing #1287).
    • The upcoming OBJSXP addition to R 4.4.0 is supported in the type2name mapper (Dirk and Iñaki in #1293).
  • Changes in Rcpp Attributes:
    • Generated interface code from base R that fails under LTO is now corrected (Iñaki in #1274 fixing a StackOverflow issue).
  • Changes in Rcpp Documentation:
    • The caption for third figure in the introductory vignette has been corrected (Dirk in #1277 fixing #1276).
    • A small formatting issue was corrected in an Rd file as noticed by R-devel (Dirk in #1282).
    • The Rcpp FAQ vignette has been updated (Dirk in #1284).
    • The Rcpp.bib file has been refreshed to current package versions.
  • Changes in Rcpp Deployment:
    • The RcppExports file for an included test package has been updated (Dirk in #1289).

Thanks to my CRANberries, you can also look at a diff to the previous release. Questions, comments etc should go to the rcpp-devel mailing list off the R-Forge page. Bug reports are welcome at the GitHub issue tracker as well (where one can also search among open or closed issues). If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

8 January 2024

Thorsten Alteholz: My Debian Activities in December 2023

FTP master
This month I accepted 235 and rejected 13 packages. The overall number of packages that got accepted was 249. I also handled lots of RM bugs and almost stopped the increase in packages this month :-). Please be aware, if you don't want your package to be removed, take care of it and keep it in good shape!

Debian LTS
This was my hundred-fourteenth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded: This month was rather calm and no unexpected things happened. The web team now automatically creates all webpages from data found in the security tracker. So I could deactivate my web-dla script again which created the webpages from the contents of the announcement mailing list. Last but not least I also did two weeks of frontdesk duties.

Debian ELTS
This month was the sixty-fifth ELTS month. During my allocated time I uploaded: Last but not least I also did two weeks of frontdesk duties.

Debian Printing
This month I uploaded a package to fix bugs: This work is generously funded by Freexian!

Debian Astro
This month I uploaded a package to fix bugs:

Other stuff
This month I uploaded new upstream versions of packages, did a source upload for the transition or uploaded them to fix one or the other issue:

25 December 2023

Sergio Talens-Oliag: GitLab CI/CD Tips: Automatic Versioning Using semantic-release

This post describes how I'm using semantic-release on gitlab-ci to manage versioning automatically for different kinds of projects following a simple workflow (a develop branch where changes are added or merged to test new versions, a temporary release/#.#.# to generate the release candidate versions and a main branch where the final versions are published).

What is semantic-release
It is a Node.js application designed to manage project versioning information on Git repositories using a continuous integration system (in this post we will use gitlab-ci).

How does it work
By default semantic-release uses semver for versioning (release versions use the format MAJOR.MINOR.PATCH) and commit messages are parsed to determine the next version number to publish. If after analyzing the commits the version number has to be changed, the command updates the files we tell it to (e.g. the package.json file for nodejs projects and possibly a CHANGELOG.md file), creates a new commit with the changed files, creates a tag with the new version and pushes the changes to the repository. When running on a CI/CD system we usually generate the artifacts related to a release (a package, a container image, etc.) from the tag, as it includes the right version number and usually has passed all the required tests (it is a good idea to run the tests again in any case, as someone could create a tag manually or we could run extra jobs when building the final assets; if they fail it is not a big issue anyway, numbers are cheap and infinite, so we can skip releases if needed).

Commit messages and versioning
The commit messages must follow a known format; the default module used to analyze them uses the angular git commit guidelines, but I prefer the conventional commits one, mainly because it's a lot easier to use when you want to update the MAJOR version. The commit message format used must be:
<type>(optional scope): <description>
[optional body]
[optional footer(s)]
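To make the header format concrete, here is a small sketch that splits the first line of a commit message into its type and a flag for the exclamation mark. It is an illustration only: commit_type is a hypothetical helper, it does not validate the type against any list, and it ignores BREAKING CHANGE: footers, which semantic-release reads from the message body.

```shell
#!/bin/sh
# Hypothetical helper: print "<type> <yes|no>" for a commit header,
# where the second word says whether a "!" precedes the colon.
commit_type() {
    header=$1
    prefix=${header%%:*}              # text before the first colon
    if [ "$prefix" = "$header" ]; then
        echo "none no"                # no colon, so no type prefix at all
        return
    fi
    breaking=no
    case "$prefix" in
        *"!") breaking=yes; prefix=${prefix%"!"} ;;
    esac
    kind=${prefix%%"("*}              # drop an optional "(scope)"
    echo "$kind $breaking"
}

commit_type "feat(api): add endpoint"
commit_type "fix!: drop old flag"
commit_type "update the README"
```

The three calls print `feat no`, `fix yes` and `none no` respectively.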
The system supports three types of branches: release, maintenance and pre-release, but for now I'm not using maintenance ones. The branches I use and their types are:
  • main as release branch (final versions are published from there)
  • develop as pre release branch (used to publish development and testing versions with the format #.#.#-SNAPSHOT.#)
  • release/#.#.# as pre release branches (they are created from develop to publish release candidate versions with the format #.#.#-rc.# and once they are merged with main they are deleted)
On the release branch (main) the version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last version change (it looks for tags on the same branch).
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last version change.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change.
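The three rules above boil down to a small piece of arithmetic on the version string. The following sketch shows that arithmetic only; next_version is a hypothetical helper that expects the strongest change found in the commits to be passed in as breaking, feat or fix, and it is not how semantic-release itself is implemented.

```shell
#!/bin/sh
# Hypothetical helper: bump a MAJOR.MINOR.PATCH version.
# $1 = current version, $2 = strongest change seen (breaking, feat or fix)
next_version() {
    version=$1
    change=$2
    major=${version%%.*}
    rest=${version#*.}
    minor=${rest%%.*}
    patch=${rest#*.}
    case "$change" in
        breaking) echo "$((major + 1)).0.0" ;;
        feat)     echo "$major.$((minor + 1)).0" ;;
        fix)      echo "$major.$minor.$((patch + 1))" ;;
        *)        echo "$version" ;;   # nothing that triggers a release
    esac
}

next_version 1.3.2 breaking
next_version 1.3.2 feat
next_version 1.3.2 fix
```

Starting from 1.3.2, the calls print 2.0.0, 1.4.0 and 1.3.3.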
On the pre release branches (develop and release/#.#.#) the version and pre release numbers are always calculated from the last published version available on the branch (i.e. if we published version 1.3.2 on main we need to have the commit with that tag on the develop or release/#.#.# branch to get right what will be the next version). The version number is updated as follows:
  1. The MAJOR number is incremented if a commit with a BREAKING CHANGE: footer or an exclamation (!) after the type/scope is found in the list of commits found since the last released version. In our example it was 1.3.2 and the version is updated to 2.0.0-SNAPSHOT.1 or 2.0.0-rc.1 depending on the branch.
  2. The MINOR number is incremented if the MAJOR number is not going to be changed and there is a commit with type feat in the commits found since the last released version. In our example the release was 1.3.2 and the version is updated to 1.4.0-SNAPSHOT.1 or 1.4.0-rc.1 depending on the branch.
  3. The PATCH number is incremented if neither the MAJOR nor the MINOR numbers are going to be changed and there is a commit with type fix in the commits found since the last version change. In our example the release was 1.3.2 and the version is updated to 1.3.3-SNAPSHOT.1 or 1.3.3-rc.1 depending on the branch.
  4. The pre release number is incremented if the MAJOR, MINOR and PATCH numbers are not going to be changed but there is a commit that would otherwise update the version (e.g. a fix on 1.3.3-SNAPSHOT.1 will set the version to 1.3.3-SNAPSHOT.2, a fix or feat on 1.4.0-rc.1 will set the version to 1.4.0-rc.2 and so on).
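Rule 4 only touches the counter after the last dot. A minimal sketch of that single step, using a hypothetical bump_prerelease helper and assuming the version already carries a pre release suffix such as -SNAPSHOT.1 or -rc.1:

```shell
#!/bin/sh
# Hypothetical helper: increment the trailing pre release counter.
bump_prerelease() {
    version=$1
    counter=${version##*.}    # digits after the last dot
    base=${version%.*}        # everything before that dot
    echo "$base.$((counter + 1))"
}

bump_prerelease 1.3.3-SNAPSHOT.1
bump_prerelease 1.4.0-rc.1
```

The two calls print 1.3.3-SNAPSHOT.2 and 1.4.0-rc.2, matching the examples in the list.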

How do we manage its configuration
Although the system is designed to work with nodejs projects, it can be used with multiple programming languages and project types. For nodejs projects the usual place to put the configuration is the project's package.json, but I prefer to use the .releaserc file instead. As I use a common set of CI templates, instead of using a .releaserc on each project I generate it on the fly on the jobs that need it, replacing values related to the project type and the current branch on a template using the tmpl command (lately I use a branch of my own fork while I wait for some feedback from upstream, as you will see on the Dockerfile).

Container used to run it
As we run the command on a gitlab-ci job we use the image built from the following Dockerfile:
Dockerfile
# Semantic release image
FROM golang:alpine AS tmpl-builder
#RUN go install github.com/krakozaure/tmpl@v0.4.0
RUN go install github.com/sto/tmpl@v0.4.0-sto.2
FROM node:lts-alpine
COPY --from=tmpl-builder /go/bin/tmpl /usr/local/bin/tmpl
RUN apk update &&\
  apk upgrade &&\
  apk add curl git jq openssh-keygen yq zip &&\
  npm install --location=global\
    conventional-changelog-conventionalcommits@6.1.0\
    @qiwi/multi-semantic-release@7.0.0\
    semantic-release@21.0.7\
    @semantic-release/changelog@6.0.3\
    semantic-release-export-data@1.0.1\
    @semantic-release/git@10.0.1\
    @semantic-release/gitlab@9.5.1\
    @semantic-release/release-notes-generator@11.0.4\
    semantic-release-replace-plugin@1.2.7\
    semver@7.5.4\
  &&\
  rm -rf /var/cache/apk/*
CMD ["/bin/sh"]

How and when is it executed
The job that runs semantic-release is executed when new commits are added to the develop, release/#.#.# or main branches (basically when something is merged or pushed) and after all tests have passed (we don't want to create a new version that does not compile or does not pass at least the unit tests). The job is something like the following:
semantic_release:
  image: $SEMANTIC_RELEASE_IMAGE
  rules:
    - if: '$CI_COMMIT_BRANCH =~ /^(develop|main|release\/\d+.\d+.\d+)$/'
      when: always
  stage: release
  before_script:
    - echo "Loading scripts.sh"
    - . $ASSETS_DIR/scripts.sh
  script:
    - sr_gen_releaserc_json
    - git_push_setup
    - semantic-release
Where the SEMANTIC_RELEASE_IMAGE variable contains the URI of the image built using the Dockerfile above and the sr_gen_releaserc_json and git_push_setup are functions defined on the $ASSETS_DIR/scripts.sh file:
  • The sr_gen_releaserc_json function generates the .releaserc.json file using the tmpl command.
  • The git_push_setup function configures git to allow pushing changes to the repository with the semantic-release command, optionally signing them with a SSH key.
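The branch rule of the job can be checked locally before pushing a pipeline change. The sketch below uses grep -E with an equivalent pattern; triggers_release is a hypothetical helper, and unlike the rule in the job it escapes the dots so they only match literal dots.

```shell
#!/bin/sh
# Hypothetical helper: succeeds for branches that should run the
# semantic_release job (develop, main or release/#.#.#).
triggers_release() {
    printf '%s\n' "$1" |
        grep -Eq '^(develop|main|release/[0-9]+\.[0-9]+\.[0-9]+)$'
}

for branch in develop main release/1.4.0 feature/login release/next; do
    if triggers_release "$branch"; then
        echo "$branch: semantic_release job runs"
    else
        echo "$branch: skipped"
    fi
done
```

With these sample names the first three branches are reported as running the job and the last two as skipped.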

The sr_gen_releaserc_json function
The code for the sr_gen_releaserc_json function is the following:
sr_gen_releaserc_json()
{
  # Use nodejs as default project_type
  project_type="${PROJECT_TYPE:-nodejs}"
  # REGEX to match the rc_branch name
  rc_branch_regex='^release\/[0-9]\+\.[0-9]\+\.[0-9]\+$'
  # PATHS on the local ASSETS_DIR
  assets_dir="${CI_PROJECT_DIR}/${ASSETS_DIR}"
  sr_local_plugin="${assets_dir}/local-plugin.cjs"
  releaserc_tmpl="${assets_dir}/releaserc.json.tmpl"
  pipeline_runtime_values_yaml="/tmp/releaserc_values.yaml"
  pipeline_values_yaml="${assets_dir}/values_${project_type}_project.yaml"
  # Destination PATH
  releaserc_json=".releaserc.json"
  # Create an empty pipeline_values_yaml if missing
  test -f "$pipeline_values_yaml" || : >"$pipeline_values_yaml"
  # Create the pipeline_runtime_values_yaml file
  echo "branch: ${CI_COMMIT_BRANCH}" >"$pipeline_runtime_values_yaml"
  # Append from here on, so the branch line written above is kept
  echo "gitlab_url: ${CI_SERVER_URL}" >>"$pipeline_runtime_values_yaml"
  # Add the rc_branch name if we are on an rc_branch
  if [ "$(echo "$CI_COMMIT_BRANCH" | sed -ne "/$rc_branch_regex/{p}")" ]; then
    echo "rc_branch: ${CI_COMMIT_BRANCH}" >>"$pipeline_runtime_values_yaml"
  elif [ "$(echo "$CI_MERGE_REQUEST_SOURCE_BRANCH_NAME" |
      sed -ne "/$rc_branch_regex/{p}")" ]; then
    echo "rc_branch: ${CI_MERGE_REQUEST_SOURCE_BRANCH_NAME}" \
      >>"$pipeline_runtime_values_yaml"
  fi
  echo "sr_local_plugin: ${sr_local_plugin}" >>"$pipeline_runtime_values_yaml"
  # Create the releaserc_json file
  tmpl -f "$pipeline_runtime_values_yaml" -f "$pipeline_values_yaml" \
    "$releaserc_tmpl" | jq . >"$releaserc_json"
  # Remove the pipeline_runtime_values_yaml file
  rm -f "$pipeline_runtime_values_yaml"
  # Print the releaserc_json file
  print_file_collapsed "$releaserc_json"
  # --*-- BEG: NOTE --*--
  # Rename the package.json to ignore it when calling semantic release.
  # The idea is that the local-plugin renames it back on the first step of the
  # semantic-release process.
  # --*-- END: NOTE --*--
  if [ -f "package.json" ]; then
    echo "Renaming 'package.json' to 'package.json_disabled'"
    mv "package.json" "package.json_disabled"
  fi
}
Almost all the variables used on the function are defined by gitlab except the ASSETS_DIR and PROJECT_TYPE; in the complete pipelines the ASSETS_DIR is defined on a common file included by all the pipelines and the project type is defined on the .gitlab-ci.yml file of each project. If you review the code you will see that the file processed by the tmpl command is named releaserc.json.tmpl; its contents are shown here:
{
  "plugins": [
    {{- if .sr_local_plugin }}
    "{{ .sr_local_plugin }}",
    {{- end }}
    [
      "@semantic-release/commit-analyzer",
      {
        "preset": "conventionalcommits",
        "releaseRules": [
          { "breaking": true, "release": "major" },
          { "revert": true, "release": "patch" },
          { "type": "feat", "release": "minor" },
          { "type": "fix", "release": "patch" },
          { "type": "perf", "release": "patch" }
        ]
      }
    ],
    {{- if .replacements }}
    [
      "semantic-release-replace-plugin",
      { "replacements": {{ .replacements | toJson }} }
    ],
    {{- end }}
    "@semantic-release/release-notes-generator",
    {{- if eq .branch "main" }}
    [
      "@semantic-release/changelog",
      { "changelogFile": "CHANGELOG.md", "changelogTitle": "# Changelog" }
    ],
    {{- end }}
    [
      "@semantic-release/git",
      {
        "assets": {{ if .assets }}{{ .assets | toJson }}{{ else }}[]{{ end }},
        "message": "ci(release): v${nextRelease.version}\n\n${nextRelease.notes}"
      }
    ],
    [
      "@semantic-release/gitlab",
      { "gitlabUrl": "{{ .gitlab_url }}", "successComment": false }
    ]
  ],
  "branches": [
    { "name": "develop", "prerelease": "SNAPSHOT" },
    {{- if .rc_branch }}
    { "name": "{{ .rc_branch }}", "prerelease": "rc" },
    {{- end }}
    "main"
  ]
}
The values used to process the template are defined on a file built on the fly (releaserc_values.yaml) that includes the following keys and values:
  • branch: the name of the current branch
  • gitlab_url: the URL of the gitlab server (the value is taken from the CI_SERVER_URL variable)
  • rc_branch: the name of the current rc branch; we only set the value if we are processing one because semantic-release only allows one branch to match the rc prefix and if we use a wildcard (i.e. release/*) but the users keep more than one release/#.#.# branch open at the same time the calls to semantic-release will fail for sure.
  • sr_local_plugin: the path to the local plugin we use (shown later)
The template also uses a values_${project_type}_project.yaml file that includes settings specific to the project type; the one for nodejs is as follows:
replacements:
  - files:
      - "package.json"
    from: "\"version\": \".*\""
    to: "\"version\": \"${nextRelease.version}\""
assets:
  - "CHANGELOG.md"
  - "package.json"
The replacements section is used to update the version field on the relevant files of the project (in our case the package.json file) and the assets section includes the files that will be committed to the repository when the release is published (looking at the template you can see that the CHANGELOG.md is only updated for the main branch; we do it this way because if we update the file on other branches it creates a merge nightmare and we are only interested in it for released versions anyway). The local plugin adds code to rename the package.json_disabled file to package.json if present and prints the last and next versions on the logs with a format that can be easily parsed using sed:
local-plugin.cjs
// Minimal plugin to:
// - rename the package.json_disabled file to package.json if present
// - log the semantic-release last & next versions
function verifyConditions(pluginConfig, context) {
  var fs = require('fs');
  if (fs.existsSync('package.json_disabled')) {
    fs.renameSync('package.json_disabled', 'package.json');
    context.logger.log(`verifyConditions: renamed 'package.json_disabled' to 'package.json'`);
  }
}
function analyzeCommits(pluginConfig, context) {
  if (context.lastRelease && context.lastRelease.version) {
    context.logger.log(`analyzeCommits: LAST_VERSION=${context.lastRelease.version}`);
  }
}
function verifyRelease(pluginConfig, context) {
  if (context.nextRelease && context.nextRelease.version) {
    context.logger.log(`verifyRelease: NEXT_VERSION=${context.nextRelease.version}`);
  }
}
module.exports = {
  verifyConditions,
  analyzeCommits,
  verifyRelease
}
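Since the plugin logs the versions as KEY=value pairs, a later step can pull them out of the job output with sed. The sketch below assumes nothing about the rest of the log line; extract_version is a hypothetical helper and the sample log lines are made up for the example.

```shell
#!/bin/sh
# Hypothetical helper: read log text on stdin and print the value that
# follows "$1=" (LAST_VERSION or NEXT_VERSION) on any matching line.
extract_version() {
    sed -n "s/^.*$1=\([^[:space:]]*\).*\$/\1/p"
}

# Made-up log excerpt in the shape produced by the plugin messages.
sample_log='[semantic-release] analyzeCommits: LAST_VERSION=1.3.2
[semantic-release] verifyRelease: NEXT_VERSION=1.4.0-rc.1'

printf '%s\n' "$sample_log" | extract_version LAST_VERSION
printf '%s\n' "$sample_log" | extract_version NEXT_VERSION
```

For this sample the two commands print 1.3.2 and 1.4.0-rc.1.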

The git_push_setup function
The code for the git_push_setup function is the following:
git_push_setup()
{
  # Update global credentials to allow git clone & push for all the group repos
  git config --global credential.helper store
  cat >"$HOME/.git-credentials" <<EOF
https://fake-user:${GITLAB_REPOSITORY_TOKEN}@gitlab.com
EOF
  # Define user name, mail and signing key for semantic-release
  user_name="$SR_USER_NAME"
  user_email="$SR_USER_EMAIL"
  ssh_signing_key="$SSH_SIGNING_KEY"
  # Export git user variables
  export GIT_AUTHOR_NAME="$user_name"
  export GIT_AUTHOR_EMAIL="$user_email"
  export GIT_COMMITTER_NAME="$user_name"
  export GIT_COMMITTER_EMAIL="$user_email"
  # Sign commits with ssh if there is a SSH_SIGNING_KEY variable
  if [ "$ssh_signing_key" ]; then
    echo "Configuring GIT to sign commits with SSH"
    ssh_keyfile="/tmp/.ssh-id"
    : >"$ssh_keyfile"
    chmod 0400 "$ssh_keyfile"
    echo "$ssh_signing_key" | tr -d '\r' >"$ssh_keyfile"
    git config gpg.format ssh
    git config user.signingkey "$ssh_keyfile"
    git config commit.gpgsign true
  fi
}
The function assumes that the GITLAB_REPOSITORY_TOKEN variable (set on the CI/CD variables section of the project or group we want) contains a token with read_repository and write_repository permissions on all the projects we are going to use this function. The SR_USER_NAME and SR_USER_EMAIL variables can be defined on a common file or the CI/CD variables section of the project or group we want to work with and the script assumes that the optional SSH_SIGNING_KEY is exported as a CI/CD default value of type variable (that is why the keyfile is created on the fly) and git is configured to use it if the variable is not empty.
Warning: Keep in mind that the variables GITLAB_REPOSITORY_TOKEN and SSH_SIGNING_KEY contain secrets, so it is probably a good idea to make them protected (if you do that you have to make the develop, main and release/* branches protected too).
Warning: The semantic-release user has to be able to push to all the projects on those protected branches, it is a good idea to create a dedicated user and add it as a MAINTAINER for the projects we want (the MAINTAINERS need to be able to push to the branches), or, if you are using a Gitlab with a Premium license you can use the api to allow the semantic-release user to push to the protected branches without allowing it for any other user.

The semantic-release command
Once we have the .releaserc file and the git configuration ready we run the semantic-release command. If the branch we are working with has one or more commits that will increment the version, the tool does the following (note that the steps described are the ones executed if we use the configuration we have generated):
  1. It detects the commits that will increment the version and calculates the next version number.
  2. Generates the release notes for the version.
  3. Applies the replacements defined on the configuration (in our example updates the version field on the package.json file).
  4. Updates the CHANGELOG.md file adding the release notes if we are going to publish the file (when we are on the main branch).
  5. Creates a commit if all or some of the files listed on the assets key have changed and uses the commit message we have defined, replacing the variables for their current values.
  6. Creates a tag with the new version number and the release notes.
  7. As we are using the gitlab plugin after tagging it also creates a release on the project with the tag name and the release notes.

Notes about the git workflows and merges between branches
It is very important to remember that semantic-release looks at the commits of a given branch when calculating the next version to publish, which has two important implications:
  1. On pre release branches we need to have the commit that includes the tag with the released version; if we don't have it the next version is not calculated correctly.
  2. It is a bad idea to squash commits when merging a branch to another one; if we do that we will lose the information semantic-release needs to calculate the next version and even if we use the right prefix for the squashed commit (fix, feat, …) we miss all the messages that would otherwise go to the CHANGELOG.md file.
To make sure that we have the right commits on the pre release branches we should merge the main branch changes into the develop one after each release tag is created; in my pipelines the first job that processes a release tag creates a branch from the tag and an MR to merge it to develop. The important thing about that MR is that it must not be squashed; if we do that the tag commit will probably be lost, so we need to be careful. To merge the changes directly we can run the following code:
# Set the SR_TAG variable to the tag you want to process
SR_TAG="v1.3.2"
# Fetch all the changes
git fetch --all --prune
# Switch to the main branch
git switch main
# Pull all the changes
git pull
# Switch to the development branch
git switch develop
# Pull all the changes
git pull
# Create followup branch from tag
git switch -c "followup/$SR_TAG" "$SR_TAG"
# Change files manually & commit the changed files
git commit -a --untracked-files=no -m "ci(followup): $SR_TAG to develop"
# Switch to the development branch
git switch develop
# Merge the followup branch into the development one using the --no-ff option
git merge --no-ff "followup/$SR_TAG"
# Remove the followup branch
git branch -d "followup/$SR_TAG"
# Push the changes
git push
If we can't push directly to develop we can create an MR pushing the followup branch after committing the changes, but we have to make sure that we don't squash the commits when merging or it will not work as we want.

19 December 2023

Dirk Eddelbuettel: tinythemes 0.0.1 at CRAN: New Package

Delighted to announce a new package that arrived on CRAN today: tinythemes. It repackages the theme_ipsum_rc() function by Bob Rudis from his hrbrthemes package in a zero (added) dependency way. A simple example (also available as a demo inside the package in the next update) contrasts the default style (on the left) with the one added by this package (on the right). The GitHub repo also shows this little example: total dependencies of hrbrthemes over what ggplot2 installs:
> db <- tools::CRAN_package_db()
> deps <- tools::package_dependencies(c("ggplot2", "hrbrthemes"), recursive=TRUE, db=db)
> Filter(\(x) x != "ggplot2", setdiff(deps[[2]], deps[[1]]))
 [1] "extrafont"         "knitr"             "rmarkdown"         "htmltools"        
 [5] "tools"             "gdtools"           "extrafontdb"       "Rttf2pt1"         
 [9] "Rcpp"              "systemfonts"       "gfonts"            "curl"             
[13] "fontquiver"        "base64enc"         "digest"            "ellipsis"         
[17] "fastmap"           "evaluate"          "highr"             "xfun"             
[21] "yaml"              "bslib"             "fontawesome"       "jquerylib"        
[25] "jsonlite"          "stringr"           "tinytex"           "cachem"           
[29] "memoise"           "mime"              "sass"              "fontBitstreamVera"
[33] "fontLiberation"    "shiny"             "crul"              "crayon"           
[37] "stringi"           "cpp11"             "urltools"          "httpcode"         
[41] "fs"                "rappdirs"          "httpuv"            "xtable"           
[45] "sourcetools"       "later"             "promises"          "commonmark"       
[49] "triebeard"        
>
Comments and suggestions are welcome at the GitHub repo. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

14 December 2023

Dirk Eddelbuettel: RProtoBuf 0.4.21 on CRAN: Updated Upstream Support!

An exciting new release 0.4.21 of RProtoBuf arrived on CRAN earlier today. RProtoBuf provides R with bindings for the Google Protocol Buffers ("ProtoBuf") data encoding and serialization library used and released by Google, and deployed very widely in numerous projects as a language and operating-system agnostic protocol. ProtoBuf development, following what seemed like a multi-year lull, all of a sudden picked up again with a vengeance a little while ago. And the library releases we rely on for convenience and provided by the Linux distributions are lagging. So last summer we received an excellent, and focussed, pull request #93 offering to update the package to the newer ProtoBuf 22.0 and beyond. (Aside: When a library ditches its numbering scheme you know changes are for real. My Ubuntu 23.10 box is still at 3.21 in a different counting scheme.) But it wasn't until last weekend that issue ticket #95 by Sebastian ran into the same issue, but recognized it and contained a container recipe! So now all of a sudden we were able to build under a newer ProtoBuf which made accepting the PR #93 much easier! We added this as an additional continuous unit test, and made a few other smaller updates to documentation and style. The following section from the NEWS.Rd file has full details.

Changes in RProtoBuf version 0.4.21 (2023-12-13)
  • Package now builds with ProtoBuf >= 22.x thanks to Matteo Gianella (#93 addressing #92).
  • An Alpine 3.19-based workflow was added to test this in continuous integration thanks to a suggestion by Sebastian Meyer.
  • A large number of old-style .Call were updated (#96).
  • Several packaging, documentation and testing items were updated.

Thanks to my CRANberries, there is a diff to the previous release. The RProtoBuf page has copies of the (older) package vignette, the quick overview vignette, and the pre-print of our JSS paper. Questions, comments etc should go to the GitHub issue tracker off the GitHub repo. If you like this or other open-source work I do, you can sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

7 December 2023

Daniel Kahn Gillmor: New OpenPGP certificate for dkg, December 2023

dkg's New OpenPGP certificate in December 2023 In December of 2023, I'm moving to a new OpenPGP certificate. You might know my old OpenPGP certificate, which had a fingerprint of C29F8A0C01F35E34D816AA5CE092EB3A5CA10DBA. My new OpenPGP certificate has a fingerprint of: D477040C70C2156A5C298549BB7E9101495E6BF7. Both certificates have the same set of User IDs:
  • Daniel Kahn Gillmor
  • <dkg@debian.org>
  • <dkg@fifthhorseman.net>
You can find a version of this transition statement signed by both the old and new certificates at: https://dkg.fifthhorseman.net/2023-dkg-openpgp-transition.txt The new OpenPGP certificate is:
-----BEGIN PGP PUBLIC KEY BLOCK-----
xjMEZXEJyxYJKwYBBAHaRw8BAQdA5BpbW0bpl5qCng/RiqwhQINrplDMSS5JsO/Y
O+5Zi7HCwAsEHxYKAH0FgmVxCcsDCwkHCRC7fpEBSV5r90cUAAAAAAAeACBzYWx0
QG5vdGF0aW9ucy5zZXF1b2lhLXBncC5vcmfUAgfN9tyTSxpxhmHA1r63GiI4v6NQ
mrrWVLOBRJYuhQMVCggCmwECHgEWIQTUdwQMcMIValwphUm7fpEBSV5r9wAAmaEA
/3MvYJMxQdLhIG4UDNMVd2bsovwdcTrReJhLYyFulBrwAQD/j/RS+AXQIVtkcO9b
l6zZTAO9x6yfkOZbv0g3eNyrAs0QPGRrZ0BkZWJpYW4ub3JnPsLACwQTFgoAfQWC
ZXEJywMLCQcJELt+kQFJXmv3RxQAAAAAAB4AIHNhbHRAbm90YXRpb25zLnNlcXVv
aWEtcGdwLm9yZ4l+Z3i19Uwjw3CfTNFCDjRsoufMoPOM7vM8HoOEdn/vAxUKCAKb
AQIeARYhBNR3BAxwwhVqXCmFSbt+kQFJXmv3AAALZQEAhJsgouepQVV98BHUH6Sv
WvcKrb8dQEZOvHFbZQQPNWgA/A/DHkjYKnUkCg8Zc+FonqOS/35sHhNA8CwqSQFr
tN4KzRc8ZGtnQGZpZnRoaG9yc2VtYW4ubmV0PsLACgQTFgoAfQWCZXEJywMLCQcJ
ELt+kQFJXmv3RxQAAAAAAB4AIHNhbHRAbm90YXRpb25zLnNlcXVvaWEtcGdwLm9y
ZxLvwkgnslsAuo+IoSa9rv8+nXpbBdab2Ft7n4H9S+d/AxUKCAKbAQIeARYhBNR3
BAxwwhVqXCmFSbt+kQFJXmv3AAAtFgD4wqcUfQl7nGLQOcAEHhx8V0Bg8v9ov8Gs
Y1ei1BEFwAD/cxmxmDSO0/tA+x4pd5yIvzgfGYHSTxKS0Ww3hzjuZA7NE0Rhbmll
bCBLYWhuIEdpbGxtb3LCwA4EExYKAIAFgmVxCcsDCwkHCRC7fpEBSV5r90cUAAAA
AAAeACBzYWx0QG5vdGF0aW9ucy5zZXF1b2lhLXBncC5vcmd7X4TgiINwnzh4jar0
Pf/b5hgxFPngCFxJSmtr/f0YiQMVCggCmQECmwECHgEWIQTUdwQMcMIValwphUm7
fpEBSV5r9wAAMuwBAPtMonKbhGOhOy+8miAb/knJ1cIPBjLupJbjM+NUE1WyAQD1
nyGW+XwwMrprMwc320mdJH9B0jdokJZBiN7++0NoBM4zBGVxCcsWCSsGAQQB2kcP
AQEHQI19uRatkPSFBXh8usgciEDwZxTnnRZYrhIgiFMybBDQwsC/BBgWCgExBYJl
cQnLCRC7fpEBSV5r90cUAAAAAAAeACBzYWx0QG5vdGF0aW9ucy5zZXF1b2lhLXBn
cC5vcmfCopazDnq6hZUsgVyztl5wmDCmxI169YLNu+IpDzJEtQKbAr6gBBkWCgBv
BYJlcQnLCRB3LRYeNc1LgUcUAAAAAAAeACBzYWx0QG5vdGF0aW9ucy5zZXF1b2lh
LXBncC5vcmcQglI7G7DbL9QmaDkzcEuk3QliM4NmleIRUW7VvIBHMxYhBHS8BMQ9
hghL6GcsBnctFh41zUuBAACwfwEAqDULksr8PulKRcIP6N9NI/4KoznyIcuOHi8q
Gk4qxMkBAIeV20SPEnWSw9MWAb0eKEcfupzr/C+8vDvsRMynCWsDFiEE1HcEDHDC
FWpcKYVJu36RAUlea/cAAFD1AP0YsE3Eeig1tkWaeyrvvMf5Kl1tt2LekTNWDnB+
FUG9SgD+Ka8vfPR8wuV8D3y5Y9Qq9xGO+QkEBCW0U1qNypg65QHOOARlcQnLEgor
BgEEAZdVAQUBAQdAWTLEa0WmnhUmDBdWXX0ZlYAa4g1CK/fXg0NPOQSteA4DAQgH
wsAABBgWCgByBYJlcQnLCRC7fpEBSV5r90cUAAAAAAAeACBzYWx0QG5vdGF0aW9u
cy5zZXF1b2lhLXBncC5vcmexrMBZe0QdQ+ZJOZxFkAiwCw2I7yTSF2Ox9GVFWKmA
mAKbDBYhBNR3BAxwwhVqXCmFSbt+kQFJXmv3AABcJQD/f4ltpSvLBOBEh/C2dIYa
dgSuqkCqq0B4WOhFRkWJZlcA/AxqLWG4o8UrrmwrmM42FhgxKtEXwCSHE00u8wR4
Up8G
=9Yc8
-----END PGP PUBLIC KEY BLOCK-----
When I have some reasonable number of certifications, I'll update the certificate associated with my e-mail addresses on https://keys.openpgp.org, in DANE, and in WKD. Until then, those lookups should continue to provide the old certificate.

6 December 2023

Reproducible Builds: Reproducible Builds in November 2023

Welcome to the November 2023 report from the Reproducible Builds project! In these reports we outline the most important things that we have been up to over the past month. As a rather rapid recap, whilst anyone may inspect the source code of free software for malicious flaws, almost all software is distributed to end users as pre-compiled binaries (more).

Reproducible Builds Summit 2023 Between October 31st and November 2nd, we held our seventh Reproducible Builds Summit in Hamburg, Germany! Amazingly, the agenda and all notes from all sessions are online; many thanks to everyone who wrote notes from the sessions. As a followup on one idea, started at the summit, Alexander Couzens and Holger Levsen started work on a cache (or tailored front-end) for the snapshot.debian.org service. The general idea is that, when rebuilding Debian, you do not actually need the whole ~140TB of data from snapshot.debian.org; rather, only a very small subset of the packages are ever used for building. It turns out, for amd64, arm64, armhf, i386, ppc64el, riscv64 and s390 for Debian trixie, unstable and experimental, this is only around 500GB, i.e. less than 1%. Although the new service is not yet ready for usage, it has already provided a promising outlook in this regard. More information is available on https://rebuilder-snapshot.debian.net and we hope that this service becomes usable in the coming weeks. The adjacent picture shows a sticky note authored by Jan-Benedict Glaw at the summit in Hamburg, confirming Holger Levsen's theory that rebuilding all Debian packages needs a very small subset of packages. The text states that 69,200 packages (in Debian sid) list 24,850 packages in their .buildinfo files, in 8,0200 variations. This little piece of paper was the beginning of rebuilder-snapshot and is a direct outcome of the summit! The Reproducible Builds team would like to thank our event sponsors who include Mullvad VPN, openSUSE, Debian, Software Freedom Conservancy, Allotropia and Aspiration Tech.

Beyond Trusting FOSS presentation at SeaGL On November 4th, Vagrant Cascadian presented Beyond Trusting FOSS at SeaGL in Seattle, WA in the United States. Founded in 2013, SeaGL is a free, grassroots technical summit dedicated to spreading awareness and knowledge about free source software, hardware and culture. The summary of Vagrant's talk mentions that it will:
[...] introduce the concepts of Reproducible Builds, including best practices for developing and releasing software, the tools available to help diagnose issues, and touch on progress towards solving decades-old deeply pervasive fundamental security issues. Learn how to verify and demonstrate trust, rather than simply hoping everything is OK!
Germane to the contents of the talk, the slides for Vagrant's talk can be built reproducibly, resulting in a PDF with a SHA1 of cfde2f8a0b7e6ec9b85377eeac0661d728b70f34 when built on Debian bookworm and c21fab273232c550ce822c4b0d9988e6c49aa2c3 on Debian sid at the time of writing.
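Checking such a claim comes down to hashing the build output and comparing the result with the published value. The following is a minimal sketch of that check; the function names and the file paths passed to them are placeholders chosen for the example, not part of any Reproducible Builds tooling.

```python
import hashlib


def file_sha1(path, chunk_size=65536):
    """Return the hex SHA1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_sha1):
    """True when a locally built artifact has the published checksum."""
    return file_sha1(path) == published_sha1.strip().lower()
```

As the two checksums quoted above show, the expected value is tied to the build environment, so a mismatch can mean a different toolchain rather than a tampered build.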

Human Factors in Software Supply Chain Security Marcel Fourné, Dominik Wermke, Sascha Fahl and Yasemin Acar have published an article in a Special Issue of the IEEE's Security & Privacy magazine. Entitled A Viewpoint on Human Factors in Software Supply Chain Security: A Research Agenda, the paper justifies the need for reproducible builds to reach developers and end-users specifically, and furthermore points out some under-researched topics that we have seen mentioned in interviews. An author pre-print of the article is available in PDF form.

Community updates On our mailing list this month:

openSUSE updates Bernhard M. Wiedemann has created a wiki page outlining a proposal to create a general-purpose Linux distribution which consists of 100% bit-reproducible packages, albeit minus the embedded signature within RPM files. It would be based on openSUSE Tumbleweed or, if available, its Slowroll-variant. In addition, Bernhard posted another monthly update for his work elsewhere in openSUSE.

Ubuntu Launchpad now supports .buildinfo files Back in 2017, Steve Langasek filed a bug against Ubuntu's Launchpad code hosting platform to report that .changes files (artifacts of building Ubuntu and Debian packages) reference .buildinfo files that aren't actually exposed by Launchpad itself. This was causing issues when attempting to process .changes files with tools such as Lintian. However, it was noticed last month that, in early August of this year, Simon Quigley had resolved this issue, and .buildinfo files are now available from the Launchpad system.

PHP reproducibility updates There have been two updates from the PHP programming language this month. Firstly, the widely-deployed PHPUnit framework for the PHP programming language has recently released version 10.5.0, which introduces the inclusion of a composer.lock file, ensuring total reproducibility of the shipped binary file. Further details and the discussion that went into their particular implementation can be found on the associated GitHub pull request. In addition, the presentation Leveraging Nix in the PHP ecosystem was given in late October at the PHP International Conference in Munich by Pol Dellaiera. While the video replay is not yet available, the (reproducible) presentation slides and speaker notes are available.

diffoscope changes diffoscope is our in-depth and content-aware diff utility that can locate and diagnose reproducibility issues. This month, Chris Lamb made a number of changes, including:
  • Improving DOS/MBR extraction by adding support for 7z.
  • Adding a missing RequiredToolNotFound import.
  • As a UI/UX improvement, try and avoid printing an extended traceback if diffoscope runs out of memory.
  • Mark diffoscope as stable on PyPI.org.
  • Uploading version 252 to Debian unstable.

Website updates A huge number of notes were added to our website that were taken at our recent Reproducible Builds Summit held between October 31st and November 2nd in Hamburg, Germany. In particular, a big thanks to Arnout Engelen, Bernhard M. Wiedemann, Daan De Meyer, Evangelos Ribeiro Tzaras, Holger Levsen and Orhun Parmaksız. In addition to this, a number of other changes were made, including:

Upstream patches The Reproducible Builds project detects, dissects and attempts to fix as many currently-unreproducible packages as possible. We endeavour to send all of our patches upstream where appropriate. This month, we wrote a large number of such patches, including:

Reproducibility testing framework The Reproducible Builds project operates a comprehensive testing framework (available at tests.reproducible-builds.org) in order to check packages and other artifacts for reproducibility. In October, a number of changes were made by Holger Levsen:
  • Debian-related changes:
    • Track packages marked as Priority: important in a new package set.
    • Stop scheduling packages that fail to build from source in bookworm and bullseye.
    • Add old releases dashboard link in web navigation.
    • Permit the pool_buildinfos script to be re-run for a specific year.
    • Grant jbglaw access to the osuosl4 node along with lynxis.
    • Increase RAM on the amd64 Ionos builders from 48 GiB to 64 GiB; thanks IONOS!
    • Move buster to archived suites.
    • Reduce the number of arm64 architecture workers from 24 to 16 in order to improve stability, reduce the workers for amd64 from 32 to 28 and, for i386, reduce from 12 down to 8.
    • Show the entire build history of each Debian package.
    • Stop scheduling already tested package/version combinations in Debian bookworm.
  • Snapshot service for rebuilders:
    • Add an HTTP-based API endpoint.
    • Add a Gunicorn instance to serve the HTTP API.
    • Add an NGINX config.
  • System-health:
    • Detect failures due to HTTP 503 Service Unavailable errors.
    • Detect failures to update package sets.
    • Detect unmet dependencies. (This usually occurs with builds of Debian live-build.)
  • Misc-related changes:
    • Do install systemd-oomd on jenkins.
    • Fix harmless typo in squid.conf for codethink04.
    • fixup: reproducible Debian: add gunicorn service to serve /api for rebuilder-snapshot.d.o.
    • Increase codethink04's Squid cache_dir size setting to 16 GiB.
    • Don't install systemd-oomd as it unfortunately kills sshd.
    • Use debootstrap from backports when commissioning nodes.
    • Add the live_build_debian_stretch_gnome, debsums-tests_buster and debsums-tests_buster jobs to the zombie list.
    • Run jekyll build with the --watch argument when building the Reproducible Builds website.
    • Misc node maintenance.
Other changes were made as well, including Mattia Rizzolo fixing rc.local's Bash syntax so it can actually run, commenting away some file cleanup code that is (potentially) deleting too much and fixing the html_brekages page for Debian package builds. Finally, an issue was diagnosed and a patch submitted to add an AddEncoding gzip .gz line to the tests.reproducible-builds.org Apache configuration so that Gzip files aren't re-compressed as Gzip, which some clients can't deal with (as well as being a waste of time).

If you are interested in contributing to the Reproducible Builds project, please visit our Contribute page on our website. However, you can get in touch with us via:

28 November 2023

Dirk Eddelbuettel: RcppSimdJson 0.1.11 on CRAN: Maintenance

A new maintenance release 0.1.11 of the RcppSimdJson package is now on CRAN. RcppSimdJson wraps the fantastic and genuinely impressive simdjson library by Daniel Lemire and collaborators. Via very clever algorithmic engineering to obtain largely branch-free code, coupled with modern C++ and newer compiler instructions, it results in parsing gigabytes of JSON per second which is quite mindboggling. The best-case performance is faster than CPU speed as use of parallel SIMD instructions and careful branch avoidance can lead to less than one cpu cycle per byte parsed; see the video of the talk by Daniel Lemire at QCon. This release responds to a CRAN request to address issues now identified by -Wformat -Wformat-security. These are frequently pretty simple changes as it was here: all it took was a call to compileAttributes() from an updated Rcpp version which now injects "%s" as a format string when calling Rf_error(). The (very short) NEWS entry for this release follows.

Changes in version 0.1.11 (2023-11-28)
  • RcppExports.cpp has been regenerated under an updated Rcpp to address a print format warning (Dirk in #88).

Courtesy of my CRANberries, there is also a diffstat report for this release. For questions, suggestions, or issues please use the issue tracker at the GitHub repo. If you like this or other open-source work I do, you can now sponsor me at GitHub.

This post by Dirk Eddelbuettel originated on his Thinking inside the box blog. Please report excessive re-aggregation in third-party for-profit settings.

23 November 2023

Freexian Collaborators: Debian Contributions: Preparing for Python 3.12, /usr-merge updates, invalid PEP-440 versions, and more! (by Utkarsh Gupta)

Contributing to Debian is part of Freexian's mission. This article covers the latest achievements of Freexian and their collaborators. All of this is made possible by organizations subscribing to our Long Term Support contracts and consulting services.

urllib3's old security patch by Stefano Rivera Stefano ran into a test-suite failure in a new Debian package (python-truststore), caused by Debian's patch to urllib3 from a decade ago, making it enable TLS verification by default (remember those days!). Some analysis confirmed that this patch isn't useful any more, and could be removed. While working on the package, Stefano investigated the scope of the urllib3 2.x transition. It looks ready to start, not many packages are affected.

Preparing for Python 3.12 in dh-python by Stefano Rivera We are preparing to start the Python 3.12 transition in Debian. Two of the upstream changes that are going to cause a lot of packages to break could be worked-around in dh-python, so we did:
  • Distutils is no longer shipped in the Python stdlib. Packages need to Build-Depend on python3-setuptools to get a (compatibility shim) distutils. Until that happens, dh-python will Depend on setuptools.
  • A failure to find any tests to execute will now make the unittest runner exit 5, like pytest does. This was our change, to catch test suites that have failed to be automatically discovered. It will cause many packages to fail to build, so until they explicitly skip running test suites, dh-python will ignore these failures.
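The second point can be illustrated with a small sketch that reproduces the described behaviour by hand, so that it does not depend on the Python version it runs on. The `exit_code_for` helper and its `ignore_no_tests` flag are hypothetical names for this example; this is not how dh-python implements its workaround.

```python
import io
import unittest

# Exit status named in the text for "no tests were run"; pytest uses the
# same value when it collects no tests.
NO_TESTS_EXIT_CODE = 5


def exit_code_for(suite, ignore_no_tests=False):
    """Run a suite and map the outcome to a process exit status."""
    runner = unittest.TextTestRunner(stream=io.StringIO(), verbosity=0)
    result = runner.run(suite)
    if not result.wasSuccessful():
        return 1
    if result.testsRun == 0 and not ignore_no_tests:
        return NO_TESTS_EXIT_CODE
    return 0
```

An empty `unittest.TestSuite()` gives 5 here, and passing `ignore_no_tests=True` gives 0, which mirrors the temporary leniency described above.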

/usr-merge by Helmut Grohne It has become clear that the planned changes to debhelper and systemd.pc cause more rc-bugs. Helmut researched these systematically and filed another stack of patches. At the time of this writing, the uploads would still cause about 40 rc-bugs. A new opt-in helper dh_movetousr has been developed and added to debhelper in trixie and unstable.

debian-printing, by Thorsten Alteholz This month Thorsten adopted two packages, namely rlpr and lprng, and moved them to the debian-printing team. As part of this Thorsten could close eight bugs in the BTS. Thorsten also uploaded a new upstream version of cups, which also meant that eleven bugs could be closed. As package hannah-foo2zjs still depended on the deprecated policykit-1 package, Thorsten changed the dependency list accordingly and could close one RC bug by the following upload.

Invalid PEP-440 Versions in Python Packages by Stefano Rivera Stefano investigated how many packages in Debian (typically Debian-native packages) recorded versions in their packaging metadata (egg-info directories) that weren't valid PEP-440 Python versions. pip is starting to enforce that all versions on the system are valid.
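A rough idea of what such a scan checks can be given with the canonical-form regular expression from Appendix B of PEP 440. This is a sketch under a simplifying assumption: it only accepts canonical public versions, so it is stricter than pip, which also accepts local version labels and normalizes several alternative spellings. The function names are made up for the example.

```python
import re

# Canonical public version pattern as given in PEP 440, Appendix B.
CANONICAL_RE = re.compile(
    r"^([1-9][0-9]*!)?(0|[1-9][0-9]*)(\.(0|[1-9][0-9]*))*"
    r"((a|b|rc)(0|[1-9][0-9]*))?(\.post(0|[1-9][0-9]*))?"
    r"(\.dev(0|[1-9][0-9]*))?$"
)


def is_canonical(version):
    """True when the string is a canonical PEP 440 public version."""
    return CANONICAL_RE.match(version) is not None


def non_canonical(versions):
    """Return the versions that fail the canonical-form check, in order."""
    return [version for version in versions if not is_canonical(version)]
```

A Debian-style string such as `1.0~rc1` fails this check, while `1.0rc1` passes.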

Miscellaneous contributions
  • distro-info-data updates in Debian, due to the new Ubuntu release, by Stefano.
  • DebConf 23 bookkeeping continues, but is winding down. Stefano still spends a little time on it.
  • Utkarsh continues to monitor and help with reimbursements.
  • Helmut continues to maintain architecture bootstrap and accidentally broke pam briefly.
  • Anton uploaded boost1.83 and started to prepare a transition to make boost1.83 the default boost version.
  • Rejuntada Debian UY 2023, a MiniDebConf that will be held in Montevideo, from 9 to 11 November, mainly organized by Santiago.

21 November 2023

Mike Hommey: How I (kind of) killed Mercurial at Mozilla

Did you hear the news? Firefox development is moving from Mercurial to Git. While the decision is far from being mine, and I was barely involved in the small incremental changes that ultimately led to this decision, I feel I have to take at least some responsibility. And if you are one of those who would rather use Mercurial than Git, you may direct all your ire at me. But let's take a step back and review the past 25 years leading to this decision. You'll forgive me for skipping some details and any possible inaccuracies. This is already a long post; while I could have been more thorough, even I think that would have been too much. This is also not an official Mozilla position, only my personal perception and recollection as someone who was involved at times, but mostly an observer from a distance. From CVS to DVCS From its release in 1998, the Mozilla source code was kept in a CVS repository. If you're too young to know what CVS is, let's just say it's an old school version control system, with its set of problems. Back then, it was mostly ubiquitous in the Open Source world, as far as I remember. In the early 2000s, the Subversion version control system gained some traction, solving some of the problems that came with CVS. Incidentally, Subversion was created by Jim Blandy, who now works at Mozilla on completely unrelated matters. In the same period, the Linux kernel development moved from CVS to BitKeeper, which was more suitable to the distributed nature of the Linux community. BitKeeper had its own problem, though: it was the opposite of Open Source, but for most pragmatic people, it wasn't a real concern because free access was provided. Until it became a problem: someone at OSDL developed an alternative client to BitKeeper, and licenses of BitKeeper were rescinded for OSDL members, including Linus Torvalds (they were even prohibited from purchasing one). Following this fiasco, in April 2005, two weeks from each other, both Git and Mercurial were born. 
The former was created by Linus Torvalds himself, while the latter was developed by Olivia Mackall, who was a Linux kernel developer back then. And because they both came out of the same community for the same needs, and the same shared experience with BitKeeper, they both were similar distributed version control systems. Interestingly enough, several other DVCSes existed: In this landscape, the major difference Git was making at the time was that it was blazing fast. Almost incredibly so, at least on Linux systems. That was less true on other platforms (especially Windows). It was a game-changer for handling large codebases in a smooth manner. Anyways, two years later, in 2007, Mozilla decided to move its source code not to Bzr, not to Git, not to Subversion (which, yes, was a contender), but to Mercurial. The decision "process" was laid down in two rather colorful blog posts. My memory is a bit fuzzy, but I don't recall that it was a particularly controversial choice. All of those DVCSes were still young, and there was no definite "winner" yet (GitHub hadn't even been founded). It made the most sense for Mozilla back then, mainly because the Git experience on Windows still wasn't there, and that mattered a lot for Mozilla, with its diverse platform support. As a contributor, I didn't think much of it, although to be fair, at the time, I was mostly consuming the source tarballs. Personal preferences Digging through my archives, I've unearthed a forgotten chapter: I did end up setting up both a Mercurial and a Git mirror of the Firefox source repository on alioth.debian.org. Alioth.debian.org was a FusionForge-based collaboration system for Debian developers, similar to SourceForge. It was the ancestor of salsa.debian.org. I used those mirrors for the Debian packaging of Firefox (cough cough Iceweasel). The Git mirror was created with hg-fast-export, and the Mercurial mirror was only a necessary step in the process. 
By that time, I had converted my Subversion repositories to Git, and switched off SVK. Incidentally, I started contributing to Git around that time as well. I apparently did this not too long after Mozilla switched to Mercurial. As a Linux user, I think I just wanted the speed that Mercurial was not providing. Not that Mercurial was that slow, but the difference between a couple seconds and a couple hundred milliseconds was a significant enough difference in user experience for me to prefer Git (and Firefox was not the only thing I was using version control for). Other people had also similarly created their own mirror, or with other tools. But none of them were "compatible": their commit hashes were different. Hg-git, used by the latter, was putting extra information in commit messages that would make the conversion differ, and hg-fast-export would just not be consistent with itself! My mirror is long gone, and those have not been updated in more than a decade. I did end up using Mercurial, when I got commit access to the Firefox source repository in April 2010. I still kept using Git for my Debian activities, but I now was also using Mercurial to push to the Mozilla servers. I joined Mozilla as a contractor a few months after that, and kept using Mercurial for a while, but as a, by then, long time Git user, it never really clicked for me. It turns out, the sentiment was shared by several at Mozilla. Git incursion In the early 2010s, GitHub was becoming ubiquitous, and the Git mindshare was getting large. Multiple projects at Mozilla were already entirely hosted on GitHub. As for the Firefox source code base, Mozilla back then was kind of a Wild West, and engineers being engineers, multiple people had been using Git, with their own inconvenient workflows involving a local Mercurial clone. The most popular set of scripts was moz-git-tools, to incorporate changes in a local Git repository into the local Mercurial copy, to then send to Mozilla servers. 
In terms of the number of people doing that, though, I don't think it was a lot of people, probably a few handfuls. On my end, I was still keeping up with Mercurial. I think at that time several engineers had their own unofficial Git mirrors on GitHub, and later on Ehsan Akhgari provided another mirror, with a twist: it also contained the full CVS history, which the canonical Mercurial repository didn't have. This was particularly interesting for engineers who needed to do some code archeology and couldn't get past the 2007 cutoff of the Mercurial repository. I think that mirror ultimately became the official-looking, but really unofficial, mozilla-central repository on GitHub. On a side note, a Mercurial repository containing the CVS history was also later set up, but that didn't lead to something officially supported on the Mercurial side. Some time around 2011~2012, I started to more seriously consider using Git for work myself, but wasn't satisfied with the workflows others had set up for themselves. I really didn't like the idea of wasting extra disk space keeping a Mercurial clone around while using a Git mirror. I wrote a Python script that would use Mercurial as a library to access a remote repository and produce a git-fast-import stream. That would allow the creation of a git repository without a local Mercurial clone. It worked quite well, but it was not able to incrementally update. Other, more complete tools existed already, some of which I mentioned above. But as time was passing and the size and depth of the Mercurial repository was growing, these tools were showing their limits and were too slow for my taste, especially for the initial clone. Boot to Git In the same time frame, Mozilla ventured in the Mobile OS sphere with Boot to Gecko, later known as Firefox OS. What does that have to do with version control? The needs of third party collaborators in the mobile space led to the creation of what is now the gecko-dev repository on GitHub. 
As I remember it, it was challenging to create, but once it was there, Git users could just clone it and have a working, up-to-date local copy of the Firefox source code and its history... which they could already have, but this was the first officially supported way of doing so. Coincidentally, Ehsan's unofficial mirror was having trouble (to the point of GitHub closing the repository) and was ultimately shut down in December 2013. You'll often find comments on the interwebs about how GitHub has become unreliable since the Microsoft acquisition. I can't really comment on that, but if you think GitHub is unreliable now, rest assured that it was worse in its early days. And its sustainability as a platform also wasn't a given, being a rather new player. So on top of having this official mirror on GitHub, Mozilla also ventured into setting up its own Git server for greater control and reliability. But the canonical repository was still the Mercurial one, and while Git users now had a supported mirror to pull from, they still had to somehow interact with Mercurial repositories, most notably for the Try server.

Git slowly creeping in Firefox build tooling

Still in the same time frame, tooling around building Firefox was improving drastically. For obvious reasons, when version control integration was needed in the tooling, Mercurial support was always a no-brainer. The first explicit acknowledgement of a Git repository for the Firefox source code, other than the addition of the .gitignore file, was bug 774109. It added a script to install the prerequisites to build Firefox on macOS (still called OSX back then), and that would print a message inviting people to obtain a copy of the source code with either Mercurial or Git. That script, from September 2012, was a precursor to the current bootstrap.py. Following that, as far as I can tell, the first real incursion of Git in the Firefox source tree tooling happened in bug 965120.
A few days earlier, bug 952379 had added a mach clang-format command that would apply clang-format-diff to the output from hg diff. Obviously, running hg diff on a Git working tree didn't work, and bug 965120 was filed, and support for Git was added there. That was in January 2014. A year later, when the initial implementation of mach artifact was added (which ultimately led to artifact builds), Git users were an immediate thought. But while they were considered, it was not to support them, but to avoid actively breaking their workflows. Git support for mach artifact was eventually added 14 months later, in March 2016.

From gecko-dev to git-cinnabar

Let's step back a little here, back to the end of 2014. My user experience with Mercurial had reached a level of dissatisfaction that was enough for me to decide to take that script from a couple years prior and make it work for incremental updates. That meant finding a way to store enough information locally to be able to reconstruct whatever the incremental updates would be relying on (guess why other tools hid a local Mercurial clone under the hood). I got something working rather quickly, and after talking to a few people about this side project at the Mozilla Portland All Hands and seeing their excitement, I published a git-remote-hg initial prototype on the last day of the All Hands. Within weeks, the prototype gained the ability to directly push to Mercurial repositories, and a couple months later, was renamed to git-cinnabar. At that point, as a Git user, instead of cloning the gecko-dev repository from GitHub and switching to a local Mercurial repository whenever you needed to push to a Mercurial repository (i.e. the aforementioned Try server, or, at the time, for reviews), you could just clone and push directly from/to Mercurial, all within Git. And it was fast too.
You could get a full clone of mozilla-central in less than half an hour, when at the time, other similar tools would take more than 10 hours (needless to say, it's even worse now). Another couple months later (we're now at the end of April 2015), git-cinnabar became able to start off a local clone of the gecko-dev repository, rather than clone from scratch, which could be time consuming. But because git-cinnabar and the tool that was updating gecko-dev weren't producing the same commits, this setup was cumbersome and not really recommended. For instance, if you pushed something to mozilla-central with git-cinnabar from a gecko-dev clone, it would come back with a different commit hash in gecko-dev, and you'd have to deal with the divergence. Eventually, in April 2020, the scripts updating gecko-dev were switched to git-cinnabar, making the use of gecko-dev alongside git-cinnabar a more viable option. Ironically(?), the switch occurred to ease collaboration with KaiOS (you know, the mobile OS born from the ashes of Firefox OS). Well, okay, in all honesty, when the need for syncing in both directions between Git and Mercurial (we only had ever synced from Mercurial to Git) came up, I nudged Mozilla in the direction of git-cinnabar, which, in my (biased but still honest) opinion, was the more reliable option for two-way synchronization (we did have regular conversion problems with hg-git; nothing of the sort has happened since the switch).

One Firefox repository to rule them all

For reasons I don't know, Mozilla decided to use separate Mercurial repositories as "branches". With the switch to the rapid release process in 2011, that meant one repository for nightly (mozilla-central), one for aurora, one for beta, and one for release. And with the addition of Extended Support Releases in 2012, we now add a new ESR repository every year. Boot to Gecko also had its own branches, and so did Fennec (Firefox for Mobile, before Android). There are a lot of them.
And then there are also integration branches, where developers' work lands before being merged in mozilla-central (or backed out if it breaks things), always leaving mozilla-central in a (hopefully) good state. Only one of them remains in use today, though. I can only suppose that the way Mercurial branches work was not deemed practical. It is worth noting, though, that Mercurial branches are used in some cases, to branch off a dot-release when the next major release process has already started, so it's not a matter of not knowing the feature exists or some such. In 2016, Gregory Szorc set up a new repository that would contain them all (or at least most of them), which eventually became what is now the mozilla-unified repository. This would e.g. simplify switching between branches when necessary. 7 years later, for some reason, the other "branches" still exist, but most developers are expected to be using mozilla-unified. Mozilla's CI also switched to using mozilla-unified as base repository. Honestly, I'm not sure why the separate repositories are still the main entry point for pushes, rather than going directly to mozilla-unified, but it probably comes down to switching being work, and not being a top priority. Also, it probably doesn't help that working with multiple heads in Mercurial, even (especially?) with bookmarks, can be a source of confusion. To give an example, if you aren't careful, and do a plain clone of the mozilla-unified repository, you may not end up on the latest mozilla-central changeset, but rather, e.g. one from beta, or some other branch, depending on which one was last updated.

Hosting is simple, right?

Put your repository on a server, install hgweb or gitweb, and that's it? Maybe that works for... Mercurial itself, but that repository "only" has slightly over 50k changesets and less than 4k files.
Mozilla-central has more than an order of magnitude more changesets (close to 700k) and two orders of magnitude more files (more than 700k if you count the deleted or moved files, 350k if you count the currently existing ones). And remember, there are a lot of "duplicates" of this repository. And I didn't even mention user repositories and project branches. Sure, it's a self-inflicted pain, and you'd think it could probably(?) be mitigated with shared repositories. But consider the simple case of two repositories: mozilla-central and autoland. You make autoland use mozilla-central as a shared repository. Now, you push something new to autoland, it's stored in the autoland datastore. Eventually, you merge to mozilla-central. Congratulations, it's now in both datastores, and you'd need to clean-up autoland if you wanted to avoid the duplication. Now, you'd think mozilla-unified would solve these issues, and it would... to some extent. Because that wouldn't cover user repositories and project branches briefly mentioned above, which in GitHub parlance would be considered as Forks. So you'd want a mega global datastore shared by all repositories, and repositories would need to only expose what they really contain. Does Mercurial support that? I don't think so (okay, I'll give you that: even if it doesn't, it could, but that's extra work). And since we're talking about a transition to Git, does Git support that? You may have read about how you can link to a commit from a fork and make-pretend that it comes from the main repository on GitHub? At least, it shows a warning, now. That's essentially the architectural reason why. So the actual answer is that Git doesn't support it out of the box, but GitHub has some backend magic to handle it somehow (and hopefully, other things like Gitea, Girocco, Gitlab, etc. have something similar). Now, to come back to the size of the repository. A repository is not a static file. 
It's a server with which you negotiate what you have against what it has that you want. Then the server bundles what you asked for based on what you said you have. Or in the opposite direction, you negotiate what you have that it doesn't, you send it, and the server incorporates what you sent it. Fortunately the latter is less frequent and requires authentication. But the former is more frequent and CPU intensive. Especially when pulling a large number of changesets, which, incidentally, cloning is. "But there is a solution for clones" you might say, which is true. That's clonebundles, which offload the CPU intensive part of cloning to a single job scheduled regularly. Guess who implemented it? Mozilla. But that only covers the cloning part. We actually had laid the ground to support offloading large incremental updates and split clones, but that never materialized. Even with all that, that still leaves you with a server that can display file contents, diffs, blames, provide zip archives of a revision, and more, all of which are CPU intensive in their own way. And these endpoints are regularly abused, and cause extra load to your servers, yes plural, because of course a single server won't handle the load for the number of users of your big repositories. And because your endpoints are abused, you have to close some of them. And I'm not mentioning the Try repository with its tens of thousands of heads, which brings its own sets of problems (and it would have even more heads if we didn't fake-merge them once in a while). Of course, all the above applies to Git (and it only gained support for something akin to clonebundles last year). So, when the Firefox OS project was stopped, there wasn't much motivation to continue supporting our own Git server, Mercurial still being the official point of entry, and git.mozilla.org was shut down in 2016. 
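To make the pull negotiation described above a bit more concrete (the client says what it has, the server bundles what is missing), here is a minimal Python sketch. It is a toy model over an in-memory parent map, not Mercurial's or Git's actual wire protocol; the function names `ancestors` and `bundle_for_pull` and the tiny example history are made up for illustration.

```python
def ancestors(parents, heads):
    """Return every commit reachable from `heads`, heads included.

    `parents` maps a commit id to the list of its parent ids.
    """
    seen = set()
    stack = list(heads)
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        stack.extend(parents.get(node, ()))
    return seen


def bundle_for_pull(parents, wants, haves):
    """Commits the server has to send: reachable from `wants`, not from `haves`.

    Commits the client claims to have but the server does not know are
    ignored, since the server cannot walk their history.
    """
    known = ancestors(parents, [h for h in haves if h in parents])
    return ancestors(parents, wants) - known


# Hypothetical history: a <- b <- c, plus a second head d on top of b.
history = {"a": [], "b": ["a"], "c": ["b"], "d": ["b"]}

incremental = bundle_for_pull(history, wants=["c", "d"], haves=["b"])
full_clone = bundle_for_pull(history, wants=["c", "d"], haves=[])
```

In this model a clone is simply the case where `haves` is empty, so the server walks and bundles the entire history, which is the expensive case that precomputed bundles are meant to take off the server.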
The growing difficulty of maintaining the status quo

Slowly, but steadily in more recent years, as new tooling was added that needed some input from the source code manager, support for Git was more and more consistently added. But at the same time, as people left for other endeavors and weren't necessarily replaced, or more recently with layoffs, resources allocated to such tooling have been spread thin. Meanwhile, the repository growth didn't take a break, and the Try repository was becoming an increasing pain, with push times quite often exceeding 10 minutes. The ongoing work to move Try pushes to Lando will hide the problem under the rug, but the underlying problem will still exist (although the last version of Mercurial seems to have improved things). On the flip side, more and more people have been relying on Git for Firefox development, to my own surprise, as I didn't really push for that to happen. It just happened organically, by way of git-cinnabar existing, providing a compelling experience to those who prefer Git, and, I guess, word of mouth. I was genuinely surprised when I recently heard the use of Git among moz-phab users had surpassed a third. I did, however, occasionally orient people who struggled with Mercurial and said they were more familiar with Git, towards git-cinnabar. I suspect there's a somewhat large number of people who never realized Git was a viable option. But that, on its own, can come with its own challenges: if you use git-cinnabar without being backed by gecko-dev, you'll have a hard time sharing your branches on GitHub, because you can't push to a fork of gecko-dev without pushing your entire local repository, as they have different commit histories. And switching to gecko-dev when you weren't already using it requires some extra work to rebase all your local branches from the old commit history to the new one.
Clone times with git-cinnabar have also started to get a little out of hand in the past few years, but this was mitigated in a similar manner as with the Mercurial cloning problem: with static files that are refreshed regularly. Ironically, that made cloning with git-cinnabar faster than cloning with Mercurial. But generating those static files is increasingly time-consuming. As of writing, generating those for mozilla-unified takes close to 7 hours. I was predicting clone times over 10 hours "in 5 years" in a post from 4 years ago; I wasn't too far off. With exponential growth, it could still happen, although to be fair, CPUs have improved since. I will explore the performance aspect in a subsequent blog post, alongside the upcoming release of git-cinnabar 0.7.0-b1. I don't even want to check how long it now takes with hg-git or git-remote-hg (they were already taking more than a day when git-cinnabar was taking a couple hours). I suppose it's about time that I clarify that git-cinnabar has always been a side-project. It hasn't been part of my duties at Mozilla, and the extent to which Mozilla supports git-cinnabar is in the form of taskcluster workers on the community instance for both git-cinnabar CI and generating those clone bundles. Consequently, that makes the above git-cinnabar specific issues a Me problem, rather than a Mozilla problem.

Taking the leap

I can't speak for the people who made the proposal to move to Git, nor for the people who put a green light on it. But I can at least give my perspective. Developers have regularly asked why Mozilla was still using Mercurial, but I think it was the first time that a formal proposal was laid out. And it came from the Engineering Workflow team, responsible for issue tracking, code reviews, source control, build and more.
It's easy to say "Mozilla should have chosen Git in the first place", but back in 2007, GitHub wasn't there, Bitbucket wasn't there, and all the available options were rather new (especially compared to the then 21-year-old CVS). I think Mozilla made the right choice, all things considered. Had they waited a couple years, the story might have been different. You might say that Mozilla stayed with Mercurial for so long because of the sunk cost fallacy. I don't think that's true either. But after the biggest Mercurial repository hosting service turned off Mercurial support, and the main contributor to Mercurial went their own way, it's hard to ignore that the landscape has evolved. And the problems that we regularly encounter with the Mercurial servers are not going to get any better as the repository continues to grow. As far as I know, all the Mercurial repositories bigger than Mozilla's are... not using Mercurial. Google has its own closed-source server, and Facebook has another of its own, and it's not really public either. With resources spread thin, I don't expect Mozilla to be able to continue supporting a Mercurial server indefinitely (although I guess Octobus could be contracted to give a hand, but is that sustainable?). Mozilla, being a champion of Open Source, also doesn't live in a silo. At some point, you have to meet your contributors where they are. And the Open Source world is now predominantly using Git. I'm sure the vast majority of new hires at Mozilla in the past, say, 5 years, know Git and have had to learn Mercurial (although they arguably didn't need to). Even within Mozilla, with thousands(!) of repositories on GitHub, Firefox is now actually the exception rather than the norm. I should even actually say Desktop Firefox, because even Mobile Firefox lives on GitHub (although Fenix is moving back in together with Desktop Firefox, and the timing is such that that will probably happen before Firefox moves to Git).
Heck, even Microsoft moved to Git! With a significant developer base already using Git thanks to git-cinnabar, and all the constraints and problems I mentioned previously, it actually seems natural that a transition (finally) happens. However, had git-cinnabar or something similarly viable not existed, I don't think Mozilla would be in a position to take this decision. On one hand, it probably wouldn't be in the current situation of having to support both Git and Mercurial in the tooling around Firefox, nor the resource constraints related to that. But on the other hand, it would be farther from supporting Git and being able to make the switch in order to address all the other problems.

But... GitHub?

I hope I made a compelling case that hosting is not as simple as it can seem, at the scale of the Firefox repository. It's also not Mozilla's main focus. Mozilla has enough on its plate with the migration of existing infrastructure that does rely on Mercurial to understandably not want to figure out the hosting part, especially with limited resources, and with the mixed experience hosting both Mercurial and git has been so far. After all, GitHub couldn't even display things like the contributors' graph on gecko-dev until recently, and hosting is literally their job! They still drop the ball on large blames (thankfully we have searchfox for those). Where does that leave us? Gitlab? For those criticizing GitHub for being proprietary, that's probably not open enough. Cloud Source Repositories? "But GitHub is Microsoft" is a complaint I've read a lot after the announcement. Do you think Google hosting would have appealed to these people? Bitbucket? I'm kind of surprised it wasn't in the list of providers that were considered, but I'm also kind of glad it wasn't (and I'll leave it at that).
I think the only relatively big hosting provider that could have made the people criticizing the choice of GitHub happy is Codeberg, but I hadn't even heard of it before it was mentioned in response to Mozilla's announcement. But really, with literal thousands of Mozilla repositories already on GitHub, with literal tens of millions of repositories on the platform overall, the pragmatist in me can't deny that it's an attractive option (and I can't stress enough that I wasn't remotely close to the room where the discussion about what choice to make happened). "But it's a slippery slope". I can see that being a real concern. LLVM also moved its repository to GitHub (from a (I think) self-hosted Subversion server), and ended up moving off Bugzilla and Phabricator to GitHub issues and PRs four years later. As an occasional contributor to LLVM, I hate this move. I hate the GitHub review UI with a passion. At least, right now, GitHub PRs are not a viable option for Mozilla, for their lack of support for security related PRs, and the more general shortcomings in the review UI. That doesn't mean things won't change in the future, but let's not get too far ahead of ourselves. The move to Git has just been announced, and the migration has not even begun yet. Just because Mozilla is moving the Firefox repository to GitHub doesn't mean it's locked in forever or that all the eggs are going to be thrown into one basket. If bridges need to be crossed in the future, we'll see then.

So, what's next?

The official announcement said we're not expecting the migration to really begin until six months from now. I'll swim against the current here, and say this: the earlier you can switch to git, the earlier you'll find out what works and what doesn't work for you, whether you already know Git or not.
While there is not one unique workflow, here's what I would recommend to anyone who wants to take the leap off Mercurial right now:

As there is no one-size-fits-all workflow, I won't tell you how to organize yourself from there. I'll just say this: if you know the Mercurial sha1s of your previous local work, you can create branches for them with:
$ git branch <branch_name> $(git cinnabar hg2git <hg_sha1>)
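If you have more than a handful of Mercurial sha1s to carry over, the one-liner above can be wrapped in a small script. The following Python sketch is only an illustration: the function names and the branch-name mapping are hypothetical, and it relies on nothing beyond the `git cinnabar hg2git` and `git branch` commands shown above. The all-zero check is an assumption about how an unknown changeset is reported, so verify it against your git-cinnabar version.

```python
import subprocess


def hg2git(hg_sha1, run=subprocess.run):
    """Translate a Mercurial changeset id to a Git commit id via git-cinnabar."""
    result = run(
        ["git", "cinnabar", "hg2git", hg_sha1],
        check=True, capture_output=True, text=True,
    )
    return result.stdout.strip()


def create_branches(branches, run=subprocess.run):
    """Create one Git branch per entry of `branches` (branch name -> hg sha1).

    Returns a dict mapping each created branch name to its Git commit id.
    `run` is injectable so the logic can be exercised without a repository.
    """
    created = {}
    for name, hg_sha1 in branches.items():
        git_sha1 = hg2git(hg_sha1, run=run)
        # Assumption: an all-zero id means the changeset is unknown locally.
        if not git_sha1 or set(git_sha1) == {"0"}:
            raise LookupError(f"no Git commit found for {hg_sha1}")
        run(["git", "branch", name, git_sha1], check=True)
        created[name] = git_sha1
    return created
```

Run from inside the Git clone, a call such as `create_branches({"my-old-work": "<hg_sha1>"})` would then do the same thing as the shell command, once per branch.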
At this point, you should have everything available on the Git side, and you can remove the .hg directory. Or move it into some empty directory somewhere else, just in case. But don't leave it here, it will only confuse the tooling. Artifact builds WILL be confused, though, and you'll have to ./mach configure before being able to do anything. You may also hit bug 1865299 if your working tree is older than this post. If you have any problem or question, you can ping me on #git-cinnabar or #git on Matrix. I'll put the instructions above somewhere on wiki.mozilla.org, and we can collaboratively iterate on them. Now, what the announcement didn't say is that the Git repository WILL NOT be gecko-dev, doesn't exist yet, and WON'T BE COMPATIBLE (trust me, it'll be for the better). Why did I make you do all the above, you ask? Because that won't be a problem. I'll have you covered, I promise. The upcoming release of git-cinnabar 0.7.0-b1 will have a way to smoothly switch between gecko-dev and the future repository (incidentally, that will also allow switching from a pure git-cinnabar clone to a gecko-dev one, for the git-cinnabar users who have kept reading this far).

What about git-cinnabar?

With Mercurial going the way of the dodo at Mozilla, my own need for git-cinnabar will vanish. Legitimately, this begs the question whether it will still be maintained. I can't answer for sure. I don't have a crystal ball. However, the needs of the transition itself will motivate me to finish some long-standing things (like finalizing the support for pushing merges, which is currently behind an experimental flag) or implement some missing features (support for creating Mercurial branches). Git-cinnabar started as a Python script, it grew a sidekick implemented in C, which then incorporated some Rust, which then cannibalized the Python script and took its place.
It is now close to 90% Rust, and 10% C (if you don't count the code from Git that is statically linked to it), and has sort of become my Rust playground (it's also, I must admit, a mess, because of its history, but it's getting better). So the day to day use with Mercurial is not my sole motivation to keep developing it. If it were, it would stay stagnant, because all the features I need are there, and the speed is not all that bad, although I know it could be better. Arguably, though, git-cinnabar has been relatively stagnant feature-wise, because all the features I need are there. So, no, I don't expect git-cinnabar to die along with Mercurial use at Mozilla, but I can't really promise anything either.

Final words

That was a long post. But there was a lot of ground to cover. And I still skipped over a bunch of things. I hope I didn't bore you to death. If I did and you're still reading... what's wrong with you? ;) So this is the end of Mercurial at Mozilla. So long, and thanks for all the fish. But this is also the beginning of a transition that is not easy, and that will not be without hiccups, I'm sure. So fasten your seatbelts (plural), and welcome the change. To circle back to the clickbait title, did I really kill Mercurial at Mozilla? Of course not. But it's like I stumbled upon a few sparks and tossed a can of gasoline on them. I didn't start the fire, but I sure made it into a proper bonfire... and now it has turned into a wildfire. And who knows? 15 years from now, someone else might be looking back at how Mozilla picked Git at the wrong time, and that, had we waited a little longer, we would have picked some yet to come new horse. But hey, that's the tech cycle for you.

12 November 2023

Petter Reinholdtsen: New and improved sqlcipher in Debian for accessing Signal database

For a while now I have wanted direct access to the Signal database of messages and channels of my Desktop edition of Signal. I prefer the enforced end to end encryption of Signal these days for my communication with friends and family, to increase the level of safety and privacy as well as raising the cost of the mass surveillance government and non-government entities practice these days. In August I came across a nice recipe explaining how to use sqlcipher to extract statistics from the Signal database. Unfortunately this did not work with the version of sqlcipher in Debian. The sqlcipher package is a "fork" of the sqlite package with added support for encrypted databases. Sadly the current Debian maintainer announced more than three years ago that he did not have time to maintain sqlcipher, so it seemed unlikely to be upgraded by the maintainer. I was reluctant to take on the job myself, as I have very limited experience maintaining shared libraries in Debian. After waiting and hoping for a few months, I gave up last week, and set out to update the package. In the process I orphaned it to make it more obvious for the next person looking at it that the package needs proper maintenance. The version in Debian was around five years old, and quite a lot of changes had taken place upstream that needed importing into the Debian maintenance git repository. After spending a few days importing the new upstream versions, and realising that upstream did not care much for SONAME versioning, as I saw library symbols being both added and removed with minor version number changes to the project, I concluded that I had to do a SONAME bump of the library package to avoid surprising the reverse dependencies. I even added a simple autopkgtest script to ensure the package works as intended.
Having dug deep into the hole of learning shared library maintenance, I set out a few days ago to upload the new version to Debian experimental to see what the quality assurance framework in Debian had to say about the result. The feedback told me the package was not too shabby, and yesterday I uploaded the latest version to Debian unstable. It should enter testing today or tomorrow, perhaps delayed by a small library transition. Armed with a new version of sqlcipher, I can now have a look at the SQL database in ~/.config/Signal/sql/db.sqlite. First, one needs to fetch the encryption key from the Signal configuration using this simple JSON extraction command:
/usr/bin/jq -r '."key"' ~/.config/Signal/config.json
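For readers who would rather script this step, here is a rough Python equivalent of the jq call, plus a helper that formats the key into the `PRAGMA key` statement used in the sqlcipher session in this post. It is a sketch under assumptions: the function names are made up for illustration, and the only things taken from the post are the top-level "key" field, its hexadecimal form, and the `x'...'` raw key syntax.

```python
import json
import re
from pathlib import Path


def extract_key(config_text):
    """Return the top-level "key" field, like jq -r '."key"' does."""
    key = json.loads(config_text)["key"]
    # The post describes the key as a hexadecimal number; refuse anything
    # else so that nothing unexpected ends up inside the SQL statement.
    if not isinstance(key, str) or not re.fullmatch(r"[0-9a-fA-F]+", key):
        raise ValueError("key is not a hexadecimal string")
    return key


def read_signal_key(config_path=Path.home() / ".config/Signal/config.json"):
    """Read the key from the Signal Desktop configuration file."""
    return extract_key(Path(config_path).read_text(encoding="utf-8"))


def pragma_key_statement(key):
    """Format a raw hexadecimal key for sqlcipher's PRAGMA key."""
    return f"PRAGMA key = \"x'{key}'\";"
```

Calling `pragma_key_statement(read_signal_key())` gives a line that can be pasted at the sqlcipher prompt. Keep in mind that the result is the database secret, so avoid writing it to logs or shell history.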
Assume the result from the jq command is 'secretkey', a hexadecimal number representing the key used to encrypt the database. Next, one can connect to the database and inject the encryption key for access via SQL to fetch information from the database. Here is an example dumping the database structure:
% sqlcipher ~/.config/Signal/sql/db.sqlite
sqlite> PRAGMA key = "x'secretkey'";
sqlite> .schema
CREATE TABLE sqlite_stat1(tbl,idx,stat);
CREATE TABLE conversations(
      id STRING PRIMARY KEY ASC,
      json TEXT,
      active_at INTEGER,
      type STRING,
      members TEXT,
      name TEXT,
      profileName TEXT
    , profileFamilyName TEXT, profileFullName TEXT, e164 TEXT, serviceId TEXT, groupId TEXT, profileLastFetchedAt INTEGER);
CREATE TABLE identityKeys(
      id STRING PRIMARY KEY ASC,
      json TEXT
    );
CREATE TABLE items(
      id STRING PRIMARY KEY ASC,
      json TEXT
    );
CREATE TABLE sessions(
      id TEXT PRIMARY KEY,
      conversationId TEXT,
      json TEXT
    , ourServiceId STRING, serviceId STRING);
CREATE TABLE attachment_downloads(
    id STRING primary key,
    timestamp INTEGER,
    pending INTEGER,
    json TEXT
  );
CREATE TABLE sticker_packs(
    id TEXT PRIMARY KEY,
    key TEXT NOT NULL,
    author STRING,
    coverStickerId INTEGER,
    createdAt INTEGER,
    downloadAttempts INTEGER,
    installedAt INTEGER,
    lastUsed INTEGER,
    status STRING,
    stickerCount INTEGER,
    title STRING
  , attemptedStatus STRING, position INTEGER DEFAULT 0 NOT NULL, storageID STRING, storageVersion INTEGER, storageUnknownFields BLOB, storageNeedsSync
      INTEGER DEFAULT 0 NOT NULL);
CREATE TABLE stickers(
    id INTEGER NOT NULL,
    packId TEXT NOT NULL,
    emoji STRING,
    height INTEGER,
    isCoverOnly INTEGER,
    lastUsed INTEGER,
    path STRING,
    width INTEGER,
    PRIMARY KEY (id, packId),
    CONSTRAINT stickers_fk
      FOREIGN KEY (packId)
      REFERENCES sticker_packs(id)
      ON DELETE CASCADE
  );
CREATE TABLE sticker_references(
    messageId STRING,
    packId TEXT,
    CONSTRAINT sticker_references_fk
      FOREIGN KEY(packId)
      REFERENCES sticker_packs(id)
      ON DELETE CASCADE
  );
CREATE TABLE emojis(
    shortName TEXT PRIMARY KEY,
    lastUsage INTEGER
  );
CREATE TABLE messages(
        rowid INTEGER PRIMARY KEY ASC,
        id STRING UNIQUE,
        json TEXT,
        readStatus INTEGER,
        expires_at INTEGER,
        sent_at INTEGER,
        schemaVersion INTEGER,
        conversationId STRING,
        received_at INTEGER,
        source STRING,
        hasAttachments INTEGER,
        hasFileAttachments INTEGER,
        hasVisualMediaAttachments INTEGER,
        expireTimer INTEGER,
        expirationStartTimestamp INTEGER,
        type STRING,
        body TEXT,
        messageTimer INTEGER,
        messageTimerStart INTEGER,
        messageTimerExpiresAt INTEGER,
        isErased INTEGER,
        isViewOnce INTEGER,
        sourceServiceId TEXT, serverGuid STRING NULL, sourceDevice INTEGER, storyId STRING, isStory INTEGER
        GENERATED ALWAYS AS (type IS 'story'), isChangeCreatedByUs INTEGER NOT NULL DEFAULT 0, isTimerChangeFromSync INTEGER
        GENERATED ALWAYS AS (
          json_extract(json, '$.expirationTimerUpdate.fromSync') IS 1
        ), seenStatus NUMBER default 0, storyDistributionListId STRING, expiresAt INT
        GENERATED ALWAYS
        AS (ifnull(
          expirationStartTimestamp + (expireTimer * 1000),
          9007199254740991
        )), shouldAffectActivity INTEGER
        GENERATED ALWAYS AS (
          type IS NULL
          OR
          type NOT IN (
            'change-number-notification',
            'contact-removed-notification',
            'conversation-merge',
            'group-v1-migration',
            'keychange',
            'message-history-unsynced',
            'profile-change',
            'story',
            'universal-timer-notification',
            'verified-change'
          )
        ), shouldAffectPreview INTEGER
        GENERATED ALWAYS AS (
          type IS NULL
          OR
          type NOT IN (
            'change-number-notification',
            'contact-removed-notification',
            'conversation-merge',
            'group-v1-migration',
            'keychange',
            'message-history-unsynced',
            'profile-change',
            'story',
            'universal-timer-notification',
            'verified-change'
          )
        ), isUserInitiatedMessage INTEGER
        GENERATED ALWAYS AS (
          type IS NULL
          OR
          type NOT IN (
            'change-number-notification',
            'contact-removed-notification',
            'conversation-merge',
            'group-v1-migration',
            'group-v2-change',
            'keychange',
            'message-history-unsynced',
            'profile-change',
            'story',
            'universal-timer-notification',
            'verified-change'
          )
        ), mentionsMe INTEGER NOT NULL DEFAULT 0, isGroupLeaveEvent INTEGER
        GENERATED ALWAYS AS (
          type IS 'group-v2-change' AND
          json_array_length(json_extract(json, '$.groupV2Change.details')) IS 1 AND
          json_extract(json, '$.groupV2Change.details[0].type') IS 'member-remove' AND
          json_extract(json, '$.groupV2Change.from') IS NOT NULL AND
          json_extract(json, '$.groupV2Change.from') IS json_extract(json, '$.groupV2Change.details[0].aci')
        ), isGroupLeaveEventFromOther INTEGER
        GENERATED ALWAYS AS (
          isGroupLeaveEvent IS 1
          AND
          isChangeCreatedByUs IS 0
        ), callId TEXT
        GENERATED ALWAYS AS (
          json_extract(json, '$.callId')
        ));
CREATE TABLE sqlite_stat4(tbl,idx,neq,nlt,ndlt,sample);
CREATE TABLE jobs(
        id TEXT PRIMARY KEY,
        queueType TEXT STRING NOT NULL,
        timestamp INTEGER NOT NULL,
        data STRING TEXT
      );
CREATE TABLE reactions(
        conversationId STRING,
        emoji STRING,
        fromId STRING,
        messageReceivedAt INTEGER,
        targetAuthorAci STRING,
        targetTimestamp INTEGER,
        unread INTEGER
      , messageId STRING);
CREATE TABLE senderKeys(
        id TEXT PRIMARY KEY NOT NULL,
        senderId TEXT NOT NULL,
        distributionId TEXT NOT NULL,
        data BLOB NOT NULL,
        lastUpdatedDate NUMBER NOT NULL
      );
CREATE TABLE unprocessed(
        id STRING PRIMARY KEY ASC,
        timestamp INTEGER,
        version INTEGER,
        attempts INTEGER,
        envelope TEXT,
        decrypted TEXT,
        source TEXT,
        serverTimestamp INTEGER,
        sourceServiceId STRING
      , serverGuid STRING NULL, sourceDevice INTEGER, receivedAtCounter INTEGER, urgent INTEGER, story INTEGER);
CREATE TABLE sendLogPayloads(
        id INTEGER PRIMARY KEY ASC,
        timestamp INTEGER NOT NULL,
        contentHint INTEGER NOT NULL,
        proto BLOB NOT NULL
      , urgent INTEGER, hasPniSignatureMessage INTEGER DEFAULT 0 NOT NULL);
CREATE TABLE sendLogRecipients(
        payloadId INTEGER NOT NULL,
        recipientServiceId STRING NOT NULL,
        deviceId INTEGER NOT NULL,
        PRIMARY KEY (payloadId, recipientServiceId, deviceId),
        CONSTRAINT sendLogRecipientsForeignKey
          FOREIGN KEY (payloadId)
          REFERENCES sendLogPayloads(id)
          ON DELETE CASCADE
      );
CREATE TABLE sendLogMessageIds(
        payloadId INTEGER NOT NULL,
        messageId STRING NOT NULL,
        PRIMARY KEY (payloadId, messageId),
        CONSTRAINT sendLogMessageIdsForeignKey
          FOREIGN KEY (payloadId)
          REFERENCES sendLogPayloads(id)
          ON DELETE CASCADE
      );
CREATE TABLE preKeys(
        id STRING PRIMARY KEY ASC,
        json TEXT
      , ourServiceId NUMBER
        GENERATED ALWAYS AS (json_extract(json, '$.ourServiceId')));
CREATE TABLE signedPreKeys(
        id STRING PRIMARY KEY ASC,
        json TEXT
      , ourServiceId NUMBER
        GENERATED ALWAYS AS (json_extract(json, '$.ourServiceId')));
CREATE TABLE badges(
        id TEXT PRIMARY KEY,
        category TEXT NOT NULL,
        name TEXT NOT NULL,
        descriptionTemplate TEXT NOT NULL
      );
CREATE TABLE badgeImageFiles(
        badgeId TEXT REFERENCES badges(id)
          ON DELETE CASCADE
          ON UPDATE CASCADE,
        'order' INTEGER NOT NULL,
        url TEXT NOT NULL,
        localPath TEXT,
        theme TEXT NOT NULL
      );
CREATE TABLE storyReads (
        authorId STRING NOT NULL,
        conversationId STRING NOT NULL,
        storyId STRING NOT NULL,
        storyReadDate NUMBER NOT NULL,
        PRIMARY KEY (authorId, storyId)
      );
CREATE TABLE storyDistributions(
        id STRING PRIMARY KEY NOT NULL,
        name TEXT,
        senderKeyInfoJson STRING
      , deletedAtTimestamp INTEGER, allowsReplies INTEGER, isBlockList INTEGER, storageID STRING, storageVersion INTEGER, storageUnknownFields BLOB, storageNeedsSync INTEGER);
CREATE TABLE storyDistributionMembers(
        listId STRING NOT NULL REFERENCES storyDistributions(id)
          ON DELETE CASCADE
          ON UPDATE CASCADE,
        serviceId STRING NOT NULL,
        PRIMARY KEY (listId, serviceId)
      );
CREATE TABLE uninstalled_sticker_packs (
        id STRING NOT NULL PRIMARY KEY,
        uninstalledAt NUMBER NOT NULL,
        storageID STRING,
        storageVersion NUMBER,
        storageUnknownFields BLOB,
        storageNeedsSync INTEGER NOT NULL
      );
CREATE TABLE groupCallRingCancellations(
        ringId INTEGER PRIMARY KEY,
        createdAt INTEGER NOT NULL
      );
CREATE TABLE IF NOT EXISTS 'messages_fts_data'(id INTEGER PRIMARY KEY, block BLOB);
CREATE TABLE IF NOT EXISTS 'messages_fts_idx'(segid, term, pgno, PRIMARY KEY(segid, term)) WITHOUT ROWID;
CREATE TABLE IF NOT EXISTS 'messages_fts_content'(id INTEGER PRIMARY KEY, c0);
CREATE TABLE IF NOT EXISTS 'messages_fts_docsize'(id INTEGER PRIMARY KEY, sz BLOB);
CREATE TABLE IF NOT EXISTS 'messages_fts_config'(k PRIMARY KEY, v) WITHOUT ROWID;
CREATE TABLE edited_messages(
        messageId STRING REFERENCES messages(id)
          ON DELETE CASCADE,
        sentAt INTEGER,
        readStatus INTEGER
      , conversationId STRING);
CREATE TABLE mentions (
        messageId REFERENCES messages(id) ON DELETE CASCADE,
        mentionAci STRING,
        start INTEGER,
        length INTEGER
      );
CREATE TABLE kyberPreKeys(
        id STRING PRIMARY KEY NOT NULL,
        json TEXT NOT NULL, ourServiceId NUMBER
        GENERATED ALWAYS AS (json_extract(json, '$.ourServiceId')));
CREATE TABLE callsHistory (
        callId TEXT PRIMARY KEY,
        peerId TEXT NOT NULL, -- conversation id (legacy) | uuid | groupId | roomId
        ringerId TEXT DEFAULT NULL, -- ringer uuid
        mode TEXT NOT NULL, -- enum "Direct" | "Group"
        type TEXT NOT NULL, -- enum "Audio" | "Video" | "Group"
        direction TEXT NOT NULL, -- enum "Incoming" | "Outgoing
        -- Direct: enum "Pending" | "Missed" | "Accepted" | "Deleted"
        -- Group: enum "GenericGroupCall" | "OutgoingRing" | "Ringing" | "Joined" | "Missed" | "Declined" | "Accepted" | "Deleted"
        status TEXT NOT NULL,
        timestamp INTEGER NOT NULL,
        UNIQUE (callId, peerId) ON CONFLICT FAIL
      );
[ dropped all indexes to save space in this blog post ]
CREATE TRIGGER messages_on_view_once_update AFTER UPDATE ON messages
      WHEN
        new.body IS NOT NULL AND new.isViewOnce = 1
      BEGIN
        DELETE FROM messages_fts WHERE rowid = old.rowid;
      END;
CREATE TRIGGER messages_on_insert AFTER INSERT ON messages
      WHEN new.isViewOnce IS NOT 1 AND new.storyId IS NULL
      BEGIN
        INSERT INTO messages_fts
          (rowid, body)
        VALUES
          (new.rowid, new.body);
      END;
CREATE TRIGGER messages_on_delete AFTER DELETE ON messages BEGIN
        DELETE FROM messages_fts WHERE rowid = old.rowid;
        DELETE FROM sendLogPayloads WHERE id IN (
          SELECT payloadId FROM sendLogMessageIds
          WHERE messageId = old.id
        );
        DELETE FROM reactions WHERE rowid IN (
          SELECT rowid FROM reactions
          WHERE messageId = old.id
        );
        DELETE FROM storyReads WHERE storyId = old.storyId;
      END;
CREATE VIRTUAL TABLE messages_fts USING fts5(
        body,
        tokenize = 'signal_tokenizer'
      );
CREATE TRIGGER messages_on_update AFTER UPDATE ON messages
      WHEN
        (new.body IS NULL OR old.body IS NOT new.body) AND
         new.isViewOnce IS NOT 1 AND new.storyId IS NULL
      BEGIN
        DELETE FROM messages_fts WHERE rowid = old.rowid;
        INSERT INTO messages_fts
          (rowid, body)
        VALUES
          (new.rowid, new.body);
      END;
CREATE TRIGGER messages_on_insert_insert_mentions AFTER INSERT ON messages
      BEGIN
        INSERT INTO mentions (messageId, mentionAci, start, length)
        
    SELECT messages.id, bodyRanges.value ->> 'mentionAci' as mentionAci,
      bodyRanges.value ->> 'start' as start,
      bodyRanges.value ->> 'length' as length
    FROM messages, json_each(messages.json ->> 'bodyRanges') as bodyRanges
    WHERE bodyRanges.value ->> 'mentionAci' IS NOT NULL
  
        AND messages.id = new.id;
      END;
CREATE TRIGGER messages_on_update_update_mentions AFTER UPDATE ON messages
      BEGIN
        DELETE FROM mentions WHERE messageId = new.id;
        INSERT INTO mentions (messageId, mentionAci, start, length)
        
    SELECT messages.id, bodyRanges.value ->> 'mentionAci' as mentionAci,
      bodyRanges.value ->> 'start' as start,
      bodyRanges.value ->> 'length' as length
    FROM messages, json_each(messages.json ->> 'bodyRanges') as bodyRanges
    WHERE bodyRanges.value ->> 'mentionAci' IS NOT NULL
  
        AND messages.id = new.id;
      END;
sqlite>
Finally I have the tool I need to inspect and process Signal messages without using the vendor-provided client. Now on to transforming the data into a more useful format. As usual, if you use Bitcoin and want to show your support of my activities, please send Bitcoin donations to my address 15oWEoG9dUPovwmUL9KWAnYRtNJEkP1u1b.
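As one possible first step toward a more useful format, the logic of the messages_on_insert_insert_mentions trigger in the schema above can be mirrored outside the database. The following is a minimal sketch, assuming the message JSON has already been read out of the messages.json column as a string; the function name and the sample message are made up for illustration and are not part of Signal.

```python
import json


def extract_mentions(message_id, message_json):
    """Return (messageId, mentionAci, start, length) tuples for one message.

    Mirrors the SELECT in the mentions triggers: walk the 'bodyRanges'
    array and keep only entries that carry a 'mentionAci'.
    """
    body_ranges = json.loads(message_json).get("bodyRanges") or []
    return [
        (message_id, item["mentionAci"], item.get("start"), item.get("length"))
        for item in body_ranges
        if item.get("mentionAci") is not None
    ]


# Made-up sample shaped like the fields the triggers read.
sample = json.dumps({
    "bodyRanges": [
        {"start": 0, "length": 1, "mentionAci": "aci-1234"},
        {"start": 5, "length": 4, "style": 1},  # no mentionAci, so skipped
    ]
})

print(extract_mentions("msg-1", sample))
```

Doing this in plain Python keeps the export step independent of the SQLite JSON operators used by the triggers.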

5 November 2023

Thorsten Alteholz: My Debian Activities in October 2023

FTP master
This month I accepted 361 and rejected 34 packages. The overall number of packages that got accepted was 362.

Debian LTS
This was my hundred-twelfth month that I did some work for the Debian LTS initiative, started by Raphael Hertzog at Freexian. During my allocated time I uploaded: Unfortunately upstream still could not resolve whether the patch for CVE-2023-42118 of libspf2 is valid, so no progress happened here. I also continued to work on bind9 and tried to understand why some tests fail. Last but not least I did some days of frontdesk duties and took part in the LTS meeting.

Debian ELTS
This month was the sixty-third ELTS month. During my allocated time I uploaded: I also continued to work on bind9 and, as with the version in LTS, tried to understand why some tests fail. Last but not least I did some days of frontdesk duties.

Debian Printing
This month I uploaded a new upstream version of: Within the context of preserving old printing packages, I adopted: If you know of any other package that is also needed and still maintained by the QA team, please tell me. I also uploaded new upstream versions of packages or uploaded a package to fix one or the other issue: This work is generously funded by Freexian!

Debian Mobcom
This month I uploaded a package to fix one or the other issue:

Other stuff
This month I uploaded new upstream versions of packages, did a source upload for the transition or uploaded it to fix one or the other issue:

29 October 2023

Aigars Mahinovs: Figuring out finances part 4

At the end of the last part of this, we got a Home Assistant OS installation that contains in itself a Firefly III instance, which in turn contains all the current financial information. Now I will try to connect the two. While it could be nice to create a fully-featured integration for Firefly III to Home Assistant to communicate all interesting values and events, I am interested in programming a more advanced data point calculation for my budget needs, so a less generic but more flexible approach is the better one for me. So I was quite interested when, among the addons in the Home Assistant Addon Store, I saw AppDaemon - a way to simply integrate arbitrary Python processing with Home Assistant. Let's see if that can do what I want. For a start, after reading the tutorial, I wanted to create a simple script that would use the Firefly III REST API to read the current balance of my main account and then send that to Home Assistant as a sensor value, which can then be displayed on a dashboard. As a quick try I modified the provided hello_world.py that is included in the default AppDaemon installation:
import requests
from datetime import datetime
import appdaemon.plugins.hass.hassapi as hass
app_token = "<FIREFLY_PERSONAL_ACCESS_TOKEN>"
firefly_url = "<FIREFLY_URL>"
class HelloWorld(hass.Hass):
    def initialize(self):
        self.run_every(self.set_asset, "now", 60 * 60)
    def set_asset(self, kwargs):
        ent = self.get_entity("sensor.firefly3_asset_sparkasse_main")
        if not ent.exists():
            ent.add(
                state=0.0,
                attributes={
                    "native_value": 0.0,
                    "native_unit_of_measurement": "EUR",
                    "state_class": "measurement",
                    "device_class": "monetary",
                    "current_balance_date": datetime.now(),
                })
        r = requests.get(
            firefly_url + "/api/v1/accounts?type=asset",
            headers={
                "Authorization": "Bearer " + app_token,
                "Accept": "application/vnd.api+json",
                "Content-Type": "application/json",
            })
        data = r.json()
        for account in data["data"]:
            if "attributes" not in account or "name" not in account["attributes"]:
                continue
            if account["attributes"]["name"] != "Sparkasse giro":
                continue
            self.log("Account :" + str(account["attributes"]))
            ent.set_state(
                state=account["attributes"]["current_balance"],
                attributes={
                    "native_value": account["attributes"]["current_balance"],
                    "current_balance_date": datetime.fromisoformat(account["attributes"]["current_balance_date"]),
                })
            self.log("Entity updated")
It uses a URL and a personal access token to access the Firefly III API, gets the asset accounts information, extracts the current balance and balance date of my main account, and then creates and/or updates a "sensor" value in Home Assistant. The sensor's metadata marks it as a monetary value and as a measurement, which makes Home Assistant track this value in the database as a graphable changing value.

I modified the file using the File Editor addon to edit the /config/appdaemon/apps/hello.py file. Each time the file is saved it is reloaded, and logs can be seen in the AppDaemon Logs section - main_log for logging messages or error_log if there is a crash. It is useful to know that the requests library is included, but it is hard to see in the docs what else is included or whether there is an easy way to install extra Python packages.

This is already a very nice basis for custom value insertion into Home Assistant - whatever you can extract or calculate with a Python script, you can also inject into Home Assistant. With even this simple approach you can monitor balances, budgets, piggy-banks, bill payment status and even the sum of transactions in particular categories in a particular time window. Especially interesting data can be found in the insight section of the Firefly III API.

The script above uses a trigger like self.run_every(self.set_asset, "now", 60 * 60) to simply run once per hour. The data in Firefly will not be updated too often anyway, at least not until we figure out how to make the bank connection run automatically without user interaction and without screwing up already existing transactions along the way. In theory the webhook API of Firefly III could be used to trigger the data update instantly when any transaction is created or updated, possibly even using the Home Assistant webhook integration. Hmmm. Maybe. Who am I kidding? I am going to make that work, for sure! :D

But first - how about figuring out the future? So what do I want to do?
In short, I want to predict what the balance on my main account will be just before the next month's salary comes in. To do this I would:
  • take the current balance of the main account
  • if this month's salary is not paid out yet, add it to the balance
  • deduct all still unpaid bills that are due between now and the target date
  • if the credit card account has not yet been reset to the main account, deduct the current amount on the cards
  • if the credit card account has been reset, but the amount has not yet been deducted from the main account, deduct the reset amount
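The steps above can be sketched as a small pure function. This is only an illustration under stated assumptions, not the actual script: the Bill class, the function name, all parameter names and the sample numbers are hypothetical, and the values are passed in directly rather than read from the Firefly III API.

```python
from dataclasses import dataclass
from datetime import date
from decimal import Decimal


@dataclass
class Bill:
    """Hypothetical minimal view of a bill: amount, next due date, paid flag."""
    amount: Decimal
    next_due: date
    paid: bool


def predict_balance(current_balance, salary, salary_paid, bills, today,
                    target_date, card_balance, card_reset_amount,
                    card_reset_done, reset_deducted):
    """Predict the main account balance just before the next salary."""
    balance = current_balance
    # Add this month's salary if it has not been paid out yet.
    if not salary_paid:
        balance += salary
    # Deduct unpaid bills that fall due between now and the target date.
    for bill in bills:
        if not bill.paid and today <= bill.next_due <= target_date:
            balance -= bill.amount
    # Credit card: either the running amount or the pending reset amount.
    if not card_reset_done:
        balance -= card_balance
    elif not reset_deducted:
        balance -= card_reset_amount
    return balance


# Made-up numbers for illustration only.
bills = [
    Bill(Decimal("700"), date(2023, 11, 1), paid=False),   # due in the window
    Bill(Decimal("30"), date(2023, 11, 5), paid=True),     # already paid
    Bill(Decimal("100"), date(2023, 12, 15), paid=False),  # due after target
]
predicted = predict_balance(
    current_balance=Decimal("1000"), salary=Decimal("2000"), salary_paid=False,
    bills=bills, today=date(2023, 10, 29), target_date=date(2023, 11, 28),
    card_balance=Decimal("250"), card_reset_amount=Decimal("0"),
    card_reset_done=False, reset_deducted=False,
)
print(predicted)
```

Keeping the calculation free of network calls makes it easy to run locally with mocked values before wiring it up to the API and to Home Assistant.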
To do that I need to use the Firefly API to read: current account info, the status of all bills including next due date and amount, transfer transactions between credit cards and the main account, and something that would store the expected salary date and amount. Ideally I'd use a recurring transaction or an income bill for this, but Firefly is not really cooperating with that. The easiest would be to just hardcode that in the script itself. And this is what I have come up with so far. To make the development process easier, I separated out the params for the API key and salary info, as well as app params for the month to predict for, and I predict both this month's and next month's balances at the same time. I edited the script locally with Neovim and also ran it locally with a few mocks, uploading to Home Assistant via the SSH addon when the local executions looked good.

So what's next? Well, I need to somewhat automate the sync with the bank (if at all possible). And for sure take a regular database and config backup :D

22 October 2023

Daniel Lange: Removing the New Event Button from Thunderbird v115 Calendar

Thunderbird in Debian stable (Bookworm) has been updated to v115.3.1 as a security update. With it comes "Supernova", a UI redesign. There is a Mozilla blogpost with a walk-through of the new UI. Unfortunately it features a super eye-catching "New Message" button that - thankfully - can be disabled. Even the whole space above the email folder pane can be recovered by disabling the folder pane header at Burger Menu -> View -> Folders -> Folder Pane Header. Unfortunately there is no way to remove the equally eye-catching "New Event" button in the Calendar view via a UI setting.

[Image: Thunderbird New Event button, German locale]

This needs a user CSS file to override the button as non-visible. For Thunderbird to process the user CSS, a config setting needs to be enabled:
  1. Burger Menu -> Settings -> General
  2. Scroll down all the way
  3. Click the Config editor... button on the bottom right
  4. Accept that hell will freeze over because you configure software
  5. Search for toolkit.legacyUserProfileCustomizations.stylesheets
  6. Toggle the value to true to enable the user CSS
You can manually add user_pref("toolkit.legacyUserProfileCustomizations.stylesheets", true); to ~/.thunderbird/abcdefgh.default/prefs.js to the same effect (do this while Thunderbird is not running; replace abcdefgh with your Thunderbird profile ID). Now create a new directory ~/.thunderbird/abcdefgh.default/chrome/, again replacing abcdefgh with your profile ID. Inside the new directory create a userChrome.css file with the following content:
/* Hide Calendar New Event button */
#primaryButtonSidePanel {
  display: none !important;
}
Restart Thunderbird. And enjoy less visual obstruction when using the Calendar.
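The manual steps above could also be scripted. The following is a minimal sketch, assuming Thunderbird is not running and that you pass in your own profile directory; the function and constant names are made up for illustration. It appends the preference line to prefs.js and the CSS rule to chrome/userChrome.css only if they are not already present, and it does not handle a prefs.js that already sets the preference to false.

```python
from pathlib import Path

PREF_LINE = 'user_pref("toolkit.legacyUserProfileCustomizations.stylesheets", true);'
CSS_RULE = """/* Hide Calendar New Event button */
#primaryButtonSidePanel {
  display: none !important;
}
"""


def append_if_missing(path, marker, text):
    """Append text to path unless marker already occurs in the file."""
    current = path.read_text(encoding="utf-8") if path.exists() else ""
    if marker in current:
        return
    # Make sure the appended text starts on its own line.
    separator = "" if not current or current.endswith("\n") else "\n"
    path.write_text(current + separator + text, encoding="utf-8")


def apply_customization(profile_dir):
    """Enable user CSS and hide the New Event button for one profile."""
    profile = Path(profile_dir)
    append_if_missing(profile / "prefs.js", PREF_LINE, PREF_LINE + "\n")
    chrome_dir = profile / "chrome"
    chrome_dir.mkdir(exist_ok=True)
    append_if_missing(chrome_dir / "userChrome.css",
                      "#primaryButtonSidePanel", CSS_RULE)


# Example call; replace abcdefgh with your Thunderbird profile ID:
# apply_customization(Path.home() / ".thunderbird" / "abcdefgh.default")
```

Running it twice leaves the files unchanged the second time, so it is safe to re-run after a profile is restored from backup.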
